Quickstart
Get up and running with large language models locally.
Install
macOS
Windows preview
Linux
Docker
The official Ollama Docker image ollama/ollama is available on Docker Hub.
Libraries
Quickstart
To run and chat with Llama 3:
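A single command downloads the model (if it is not already local) and opens an interactive chat:

```shell
ollama run llama3
```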
Model library
Ollama supports a list of models available on ollama.com/library.
Here are some example models that can be downloaded:
| Model | Parameters | Size | Download |
| --- | --- | --- | --- |
| Llama 3 | 8B | 4.7GB | `ollama run llama3` |
| Llama 3 | 70B | 40GB | `ollama run llama3:70b` |
| Mistral | 7B | 4.1GB | `ollama run mistral` |
| Dolphin Phi | 2.7B | 1.6GB | `ollama run dolphin-phi` |
| Phi-2 | 2.7B | 1.7GB | `ollama run phi` |
| Neural Chat | 7B | 4.1GB | `ollama run neural-chat` |
| Starling | 7B | 4.1GB | `ollama run starling-lm` |
| Code Llama | 7B | 3.8GB | `ollama run codellama` |
| Llama 2 Uncensored | 7B | 3.8GB | `ollama run llama2-uncensored` |
| Llama 2 13B | 13B | 7.3GB | `ollama run llama2:13b` |
| Llama 2 70B | 70B | 39GB | `ollama run llama2:70b` |
| Orca Mini | 3B | 1.9GB | `ollama run orca-mini` |
| LLaVA | 7B | 4.5GB | `ollama run llava` |
| Gemma | 2B | 1.4GB | `ollama run gemma:2b` |
| Gemma | 7B | 4.8GB | `ollama run gemma:7b` |
| Solar | 10.7B | 6.1GB | `ollama run solar` |
Note: You should have at least 8 GB of RAM to run the 7B models, 16 GB to run the 13B models, and 32 GB to run the 33B models.
Customize a model
Import from GGUF
Ollama supports importing GGUF models in the Modelfile:
- Create a file named Modelfile, with a FROM instruction pointing to the local filepath of the model you want to import.
- Create the model in Ollama.
- Run the model.
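As a concrete sketch of the steps above (the GGUF filename and the model name `example` are placeholders, not files shipped with Ollama):

```shell
# Modelfile contents (one line): FROM ./your-model.Q4_0.gguf
ollama create example -f Modelfile   # create the model in Ollama
ollama run example                   # run the imported model
```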
Import from PyTorch or Safetensors
See the guide on importing models for more information.
Customize a prompt
Models from the Ollama library can be customized with a prompt. For example, to customize the llama3 model:
Create a Modelfile:
```
FROM llama3
# set the temperature to 1 [higher is more creative, lower is more coherent]
PARAMETER temperature 1
# set the system message
SYSTEM """
You are Mario from Super Mario Bros. Answer as Mario, the assistant, only.
"""
```
Next, create and run the model:
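Assuming the Modelfile above is saved in the current directory (the model name `mario` is just an example):

```shell
ollama create mario -f ./Modelfile   # build the customized model
ollama run mario                     # start chatting with it
```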
For more examples, see the examples directory. For more information on working with a Modelfile, see the Modelfile documentation.
CLI Reference
Create a model
Use ollama create to create a model from a Modelfile.
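For example (the model name `mymodel` and the Modelfile path are placeholders):

```shell
ollama create mymodel -f ./Modelfile
```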
Pull a model
This command can also be used to update a local model. Only the diff will be pulled.
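For example, to pull (or update) the Llama 3 model:

```shell
ollama pull llama3
```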
Remove a model
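For example:

```shell
ollama rm llama3
```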
Copy a model
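For example, to copy llama3 under a new name (`my-model` is a placeholder):

```shell
ollama cp llama3 my-model
```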
Multiline input
For multiline input, you can wrap text with """:
```
>>> """Hello,
... world!
... """
I'm a basic program that prints the famous "Hello, world!" message to the console.
```
Multimodal models
```
>>> What's in this image? /Users/jmorgan/Desktop/smile.png
The image features a yellow smiley face, which is likely the central focus of the picture.
```
Pass the prompt as an argument
```
$ ollama run llama3 "Summarize this file: $(cat README.md)"
Ollama is a lightweight, extensible framework for building and running language models on the local machine. It provides a simple API for creating, running, and managing models, as well as a library of pre-built models that can be easily used in a variety of applications.
```
List models on your computer
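Use the list subcommand:

```shell
ollama list
```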
Start Ollama
ollama serve is used when you want to start ollama without running the desktop application.
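For example:

```shell
ollama serve
```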
Building
Install cmake and go:
Then generate dependencies:
Then build the binary:
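On macOS with Homebrew, the three steps above might look like the following sketch (the package names assume Homebrew, and the generate/build targets assume you are in the repository root):

```shell
brew install cmake go   # install build prerequisites
go generate ./...       # generate dependencies
go build .              # build the ollama binary
```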
More detailed instructions can be found in the developer guide.
Running a local build
Next, start the server:
Finally, in a separate shell, run a model:
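From the repository root, the two steps above can be sketched as:

```shell
./ollama serve
```

and, in a separate shell:

```shell
./ollama run llama3
```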
REST API
Ollama has a REST API for running and managing models.
Generate a response
```shell
curl http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "Why is the sky blue?"
}'
```
Chat with a model
```shell
curl http://localhost:11434/api/chat -d '{
  "model": "llama3",
  "messages": [
    { "role": "user", "content": "why is the sky blue?" }
  ]
}'
```
See the API documentation for all endpoints.
Community Integrations
Web & Desktop
- Lollms-Webui
- LibreChat
- Bionic GPT
- Enchanted (macOS native)
- HTML UI
- Saddle
- Chatbot UI
- Typescript UI
- Minimalistic React UI for Ollama Models
- Open WebUI
- Ollamac
- big-AGI
- Cheshire Cat assistant framework
- Amica
- chatd
- Ollama-SwiftUI
- Dify.AI
- MindMac
- NextJS Web Interface for Ollama
- Msty
- Chatbox
- WinForm Ollama Copilot
- NextChat with Get Started Doc
- Alpaca WebUI
- OllamaGUI
- OpenAOE
- Odin Runes
- LLM-X: Progressive Web App
- AnythingLLM (Docker + macOS/Windows/Linux native app)
- Ollama Basic Chat: Uses HyperDiv Reactive UI
- Ollama-chats RPG
- ChatOllama: Open Source Chatbot based on Ollama with Knowledge Bases
- CRAG Ollama Chat: Simple Web Search with Corrective RAG
- RAGFlow: Open-source Retrieval-Augmented Generation engine based on deep document understanding
Terminal
- oterm
- Ellama Emacs client
- Emacs client
- gen.nvim
- ollama.nvim
- ollero.nvim
- ollama-chat.nvim
- ogpt.nvim
- gptel Emacs client
- Oatmeal
- cmdh
- ooo
- tenere
- llm-ollama for Datasette's LLM CLI.
- typechat-cli
- ShellOracle
- tlm
Database
- MindsDB (Connects Ollama models with nearly 200 data platforms and apps)
- chromem-go with example
Package managers
Libraries
- LangChain and LangChain.js with example
- LangChainGo with example
- LangChain4j with example
- LlamaIndex
- LiteLLM
- OllamaSharp for .NET
- Ollama for Ruby
- Ollama-rs for Rust
- Ollama4j for Java
- ModelFusion Typescript Library
- OllamaKit for Swift
- Ollama for Dart
- Ollama for Laravel
- LangChainDart
- Semantic Kernel - Python
- Haystack
- Elixir LangChain
- Ollama for R - rollama
- Ollama-ex for Elixir
- Ollama Connector for SAP ABAP
- Testcontainers
Mobile
Extensions & Plugins
- Raycast extension
- Discollama (Discord bot inside the Ollama discord channel)
- Continue
- Obsidian Ollama plugin
- Logseq Ollama plugin
- NotesOllama (Apple Notes Ollama plugin)
- Dagger Chatbot
- Discord AI Bot
- Ollama Telegram Bot
- Hass Ollama Conversation
- Rivet plugin
- Llama Coder (Copilot alternative using Ollama)
- Obsidian BMO Chatbot plugin
- Cliobot (Telegram bot with Ollama support)
- Copilot for Obsidian plugin
- Obsidian Local GPT plugin
- Open Interpreter
- twinny (Copilot and Copilot chat alternative using Ollama)
- Wingman-AI (Copilot code and chat alternative using Ollama and HuggingFace)
- Page Assist (Chrome Extension)
- AI Telegram Bot (Telegram bot using Ollama in backend)
- AI ST Completion (Sublime Text 4 AI assistant plugin with Ollama support)
Supported backends
- llama.cpp project founded by Georgi Gerganov.