I. Setting up a local LLM with Ollama
1. Install Ollama
See the official Ollama documentation.
2. Install llama3
ollama run llama3
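Running `ollama run llama3` pulls the model on first use and opens an interactive prompt; Ollama also exposes a local HTTP API on port 11434, which is what LiteLLM talks to below. As a minimal sketch (assuming Ollama's `/api/chat` endpoint on the default port), the request body looks like this:

```python
import json

# Build the JSON body that Ollama's /api/chat endpoint expects.
# POST it to http://localhost:11434/api/chat once the server is running.
def build_chat_payload(model, prompt):
    return {
        "model": model,                                     # e.g. "llama3"
        "messages": [{"role": "user", "content": prompt}],  # chat history
        "stream": False,                                    # one complete reply
    }

print(json.dumps(build_chat_payload("llama3", "Hello, how are you?")))
```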
II. Using LiteLLM
LiteLLM can be used in two ways:
- as an OpenAI-compatible proxy server
- via the LiteLLM Python SDK
1. LiteLLM Python SDK
from litellm import completion

response = completion(
    model="ollama/llama3",
    messages=[{"content": "Hello, how are you?", "role": "user"}],
    api_base="http://localhost:11434"
)
See LiteLLM - Getting Started.
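The `model` string above follows LiteLLM's `provider/model` naming convention: the prefix before the slash selects the backend (here `ollama`), and the remainder is the model name passed through to it. A rough illustration of the convention (not LiteLLM's actual internals):

```python
def split_provider(model: str):
    # "ollama/llama3" -> provider "ollama", model name "llama3"
    provider, _, name = model.partition("/")
    return provider, name

print(split_provider("ollama/llama3"))  # ('ollama', 'llama3')
```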
2. OpenAI-compatible proxy server
Install the dependency (the quotes keep the shell from expanding the brackets):
pip install 'litellm[proxy]'
Start the proxy (it listens on http://localhost:4000 by default):
litellm --model ollama/llama3
Example code:
import openai

def openai_usage():
    # Use the LiteLLM OpenAI-compatible proxy; the API key can be any string
    client = openai.OpenAI(api_key="anything", base_url="http://localhost:4000")  # point base_url at the proxy
    # The request goes to whichever model the proxy was started with (`litellm --model`)
    response = client.chat.completions.create(
        model="ollama/llama3",
        messages=[
            {
                "role": "user",
                "content": "this is a test request, write a short poem",
            }
        ],
    )
    print(response)
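Both the SDK and the proxy return responses in the OpenAI chat-completions format, so the reply text lives at `choices[0].message.content` either way. A sketch using a mock response dict (placeholder values, not real model output):

```python
# Mock of an OpenAI-format chat completion (values are placeholders).
mock_response = {
    "id": "chatcmpl-123",
    "choices": [
        {"index": 0, "message": {"role": "assistant", "content": "a short poem"}}
    ],
}

# With the SDK's response object, the same path is attribute access:
# response.choices[0].message.content
reply = mock_response["choices"][0]["message"]["content"]
print(reply)  # a short poem
```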