# Running a Local Ollama Model

Here are some instructions for using Ollama with LiteLLM.

## Instructions

1. Install Ollama by visiting the website [https://ollama.ai/download](https://ollama.ai/download) and selecting the appropriate operating system.

2. Once installed, open the Ollama app, which should be running in your taskbar.

3. Open the terminal and download a model. For example, to download the llama2 model (and start an interactive session with it once the download finishes), run the command:

```bash
ollama run llama2
```

4. In your terminal, activate your virtual environment and install the `litellm[proxy]` package. With pip, run the command:

```bash
pip install 'litellm[proxy]'
```
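
Before going further, it can help to confirm that the Ollama server is reachable and that the model from step 3 is present. The following is a minimal sketch using only the standard library. It assumes Ollama is listening on its default local address (`http://localhost:11434`) and that its `/api/tags` endpoint returns a JSON object with a `models` list whose entries have a `name` field; the helper names are hypothetical and not part of Ollama or LiteLLM.

```python
import json
import urllib.error
import urllib.request

# Assumption: Ollama is listening on its default local address.
OLLAMA_URL = "http://localhost:11434"


def parse_model_names(payload: str) -> list[str]:
    """Return the model names found in an /api/tags JSON response body."""
    data = json.loads(payload)
    return [entry["name"] for entry in data.get("models", [])]


def has_model(names: list[str], wanted: str) -> bool:
    """True if `wanted` matches a name once its ':tag' suffix is ignored."""
    return any(name.split(":", 1)[0] == wanted for name in names)


def list_local_models(base_url: str = OLLAMA_URL, timeout: float = 3.0) -> list[str]:
    """Query the local Ollama server; return [] if it cannot be reached."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return parse_model_names(resp.read().decode("utf-8"))
    except (urllib.error.URLError, OSError):
        # Server not running, wrong port, or request timed out.
        return []
```

Calling `has_model(list_local_models(), "llama2")` should return `True` once the download has finished. If it returns `False`, check that the Ollama app is running and re-run the command from step 3.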

You should then be able to patch LiteLLM's `completion` function using the `wrap_chatcompletion` API.
Because this setup relies on regular prompting rather than function calling, you'll need to put much more instruction in the system message to get the model to output JSON.

```python
from litellm import completion
from pydantic import BaseModel

import instructor
from instructor.patch import wrap_chatcompletion

completion = wrap_chatcompletion(completion, mode=instructor.Mode.MD_JSON)


class UserExtract(BaseModel):
    name: str
    age: int


user = completion(
    model="ollama/llama2",
    response_model=UserExtract,
    messages=[
        {
            "role": "system",
            "content": "You are a JSON extractor. Please extract the following JSON, No Talk.",
        },
        {
            "role": "user",
            "content": "Extract `My name is Jason and I am 25 years old`",
        },
    ],
)

assert isinstance(user, UserExtract), "Should be instance of UserExtract"
assert user.name.lower() == "jason"
assert user.age == 25
```
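
Since the model only sees plain prompts, a more explicit system message may improve reliability with smaller local models. The sketch below is one possible way to spell out the expected JSON shape and to parse a reply that wraps JSON in a markdown code fence. It is an illustration of the idea, not instructor's actual implementation. `USER_SCHEMA`, `build_system_message`, and `extract_json_block` are hypothetical names, and the schema is written by hand (with Pydantic v2 you could likely use `UserExtract.model_json_schema()` instead).

```python
import json
import re

# Built programmatically so this example contains no literal markdown fence.
FENCE = "`" * 3

# Hypothetical hand-written schema mirroring the UserExtract model.
USER_SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
    },
    "required": ["name", "age"],
}


def build_system_message(schema: dict) -> str:
    """Describe the expected JSON shape explicitly in the system prompt."""
    return (
        "You are a JSON extractor. Reply with a single markdown code block "
        f"opened with {FENCE}json and no other text. "
        "The JSON must match this schema:\n" + json.dumps(schema, indent=2)
    )


def extract_json_block(reply: str) -> dict:
    """Parse the first fenced block in a reply, or the whole reply if unfenced."""
    pattern = FENCE + r"(?:json)?\s*(.*?)" + FENCE
    match = re.search(pattern, reply, re.DOTALL)
    raw = match.group(1) if match else reply
    return json.loads(raw)
```

The output of `build_system_message(USER_SCHEMA)` could replace the short system message in the example above. `extract_json_block` raises `json.JSONDecodeError` when the reply contains no valid JSON, so callers may want to catch that and retry.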