mirror of
https://github.com/kennethreitz/instructor.git
synced 2026-06-05 22:50:18 +00:00
60 lines
1.6 KiB
Markdown
# Running a Local Ollama Model

Here are some instructions on using Ollama and LiteLLM.

## Instructions

1. Install Ollama by visiting [https://ollama.ai/download](https://ollama.ai/download) and selecting the installer for your operating system.

2. Once installed, open the Ollama app. It should then be running in your taskbar.

3. Open a terminal and download a model. For example, to download the llama2 model and start it, run:

```bash
ollama run llama2
```

4. In your terminal, activate your virtual environment and install the `litellm[proxy]` package. With pip, run:

```bash
pip install 'litellm[proxy]'
```

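Before patching anything, it can help to confirm that the local Ollama server is reachable and that the model was downloaded. The following is a minimal sketch rather than part of the original instructions. It assumes Ollama is listening on its default address, `http://localhost:11434`, and that its `/api/tags` endpoint returns a JSON object with a `models` list. The helper names `model_names` and `list_local_models` are hypothetical.

```python
import json
import urllib.error
import urllib.request
from typing import List, Optional

# Assumption: Ollama's default local address. Change this if yours differs.
OLLAMA_URL = "http://localhost:11434"


def model_names(payload: dict) -> List[str]:
    """Extract model names from a payload shaped like {"models": [{"name": ...}]}."""
    return [entry.get("name", "") for entry in payload.get("models", [])]


def list_local_models(base_url: str = OLLAMA_URL, timeout: float = 2.0) -> Optional[List[str]]:
    """Return the locally available model names, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            payload = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, or a non-JSON response.
        return None
    return model_names(payload)


if __name__ == "__main__":
    models = list_local_models()
    if models is None:
        print("Ollama does not appear to be running.")
    else:
        print("Available models:", models)
```

If the function returns `None`, start the Ollama app first. If the list does not contain a llama2 entry, repeat step 3.
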
You should then be able to patch LiteLLM's `completion` function with the `wrap_chatcompletion` API.
Because this approach uses regular prompting rather than function calling, the system message needs considerably more explicit instructions asking the model to output JSON.

```python
from litellm import completion
from pydantic import BaseModel

import instructor
from instructor.patch import wrap_chatcompletion

# Wrap LiteLLM's completion function so that it accepts `response_model`.
# MD_JSON mode relies on regular prompting rather than function calling.
completion = wrap_chatcompletion(completion, mode=instructor.Mode.MD_JSON)


class UserExtract(BaseModel):
    name: str
    age: int


user = completion(
    model="ollama/llama2",
    response_model=UserExtract,
    messages=[
        {
            "role": "system",
            "content": "You are a JSON extractor. Please extract the following JSON, No Talk.",
        },
        {
            "role": "user",
            "content": "Extract `My name is Jason and I am 25 years old`",
        },
    ],
)

# The patched call returns a validated UserExtract instance.
assert isinstance(user, UserExtract), "Should be instance of UserExtract"
assert user.name.lower() == "jason"
assert user.age == 25
```
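To illustrate why the system message has to carry more instructions when there is no function calling, here is a rough, standard-library-only sketch of the general idea: spell out the expected keys in the prompt, then pull the JSON out of the model's reply and check it. This is not instructor's actual implementation. The helpers `build_system_prompt` and `parse_reply` are hypothetical, and the sketch assumes the model wraps its answer in a Markdown `json` code block.

```python
import json
import re
from dataclasses import dataclass
from typing import Dict

FENCE = "`" * 3  # a Markdown code fence, built here so it does not clash with this block


@dataclass
class UserExtract:
    name: str
    age: int


def build_system_prompt(fields: Dict[str, str]) -> str:
    """Spell out the expected JSON keys, since no function-calling schema is sent."""
    lines = [f'- "{key}": {type_name}' for key, type_name in fields.items()]
    return (
        "You are a JSON extractor. Reply with a single JSON object inside a "
        f"Markdown code block opened with {FENCE}json, and nothing else.\n"
        "Required keys:\n" + "\n".join(lines)
    )


def parse_reply(reply: str) -> UserExtract:
    """Extract the first fenced JSON object from a reply and check its field types."""
    pattern = FENCE + r"(?:json)?\s*(\{.*?\})\s*" + FENCE
    match = re.search(pattern, reply, re.DOTALL)
    raw = match.group(1) if match else reply  # fall back to the whole reply
    data = json.loads(raw)
    name, age = data["name"], data["age"]
    if not isinstance(name, str) or not isinstance(age, int):
        raise TypeError("reply JSON has the wrong field types")
    return UserExtract(name=name, age=age)


if __name__ == "__main__":
    print(build_system_prompt({"name": "string", "age": "integer"}))
    # A hand-written stand-in for a model reply, not real model output.
    fake_reply = "Sure!\n" + FENCE + 'json\n{"name": "Jason", "age": 25}\n' + FENCE
    print(parse_reply(fake_reply))
```

In practice the patched `completion` call handles this for you, using Pydantic for validation. The sketch only shows the kind of work that prompting-based extraction implies.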