Optimising Dolphin 3

This guide explains how to create and run an optimised version of the Dolphin 3 model using Ollama.

Setup Commands

# Create optimised model (run from the repository root, where the modelfile lives)
ollama create dolphin-optimised -f modelfile

# Run the server with optimisations (keep it running in a separate terminal)
OLLAMA_FLASH_ATTENTION=true OLLAMA_KV_CACHE_TYPE=f16 ollama serve

# Run the optimised model
ollama run --verbose dolphin-optimised:latest

# Or use Open WebUI as a front end (ensure Docker is running), then open http://localhost:3000
docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main
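
Before running the model or starting Open WebUI, it can help to confirm that the Ollama server is reachable. The following is a minimal sketch, assuming the server listens on Ollama's default address http://localhost:11434; the OLLAMA_URL variable and the ollama_ready helper are hypothetical names introduced here for illustration.

```shell
#!/bin/sh
# Hypothetical helper: report whether an Ollama server is reachable.
# Assumes the default address; override with OLLAMA_URL if yours differs.
OLLAMA_URL="${OLLAMA_URL:-http://localhost:11434}"

ollama_ready() {
    # GET /api/version returns a small JSON document when the server is up.
    # -f makes curl fail on HTTP errors; --max-time avoids hanging.
    curl -fsS --max-time 2 "$1/api/version" >/dev/null 2>&1
}

if ollama_ready "$OLLAMA_URL"; then
    status="running"
else
    status="not running"
fi

echo "Ollama server at $OLLAMA_URL: $status"
```

If the script reports "not running", start the server with the serve command above and try again.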

Notes

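More system prompts: https://github.com/cognitivecomputations/dolphin-system-messages
Parameter settings are documented in the Ollama Modelfile reference: https://github.com/ollama/ollama/blob/main/docs/modelfile.md
Flash Attention is enabled on the server (OLLAMA_FLASH_ATTENTION=true), which can reduce memory use and improve performance, particularly at longer context lengths.
The KV cache type is set to f16 (OLLAMA_KV_CACHE_TYPE=f16), which is Ollama's default; the quantised types q8_0 and q4_0 trade some precision for lower memory use.
Ollama must be installed. Docker is needed only for the Open WebUI option.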
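
The notes above point to system prompts and parameter settings, both of which live in the modelfile. As a rough illustration of how these fit together, the sketch below writes an example modelfile to a temporary directory. Every value in it is an assumption chosen for illustration (the dolphin3 base model name, the num_ctx and temperature values, and the system prompt text); it does not reproduce the modelfile shipped in this repository.

```shell
#!/bin/sh
# Sketch only: all values below are illustrative assumptions,
# not the contents of this repository's modelfile.
workdir="$(mktemp -d)"

# Quoted heredoc delimiter so the shell does not expand anything inside.
cat > "$workdir/modelfile" <<'EOF'
FROM dolphin3
PARAMETER num_ctx 8192
PARAMETER temperature 0.7
SYSTEM """You are Dolphin, a helpful AI assistant."""
EOF

echo "Wrote $workdir/modelfile"
echo "Next: ollama create dolphin-optimised -f $workdir/modelfile"
```

FROM selects the base model, each PARAMETER line sets one runtime option, and SYSTEM supplies the system prompt. The script only prints the create command rather than running it, so nothing is downloaded or built until you choose to.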
