Often times not only do you want the base model but may also want the original response from the API. You can do this by retrieving the `raw_response`, since the `raw_response` is also a pydantic model, you can use any of the pydantic model methods on it.

```python
import instructor

from openai import OpenAI
from pydantic import BaseModel

client = instructor.patch(OpenAI())


class UserExtract(BaseModel):
    name: str
    age: int


user: UserExtract = client.chat.completions.create(
    model="gpt-3.5-turbo",
    response_model=UserExtract,
    messages=[
        {"role": "user", "content": "Extract jason is 25 years old"},
    ],
)

print(user._raw_response)
"""
ChatCompletion(
    id='chatcmpl-8pOAsSOIHAmngMBBki3BLN3p552L0',
    choices=[
        Choice(
            finish_reason='stop',
            index=0,
            logprobs=None,
            message=ChatCompletionMessage(
                content=None,
                role='assistant',
                function_call=FunctionCall(
                    arguments='{\n  "name": "Jason",\n  "age": 25\n}',
                    name='UserExtract',
                ),
                tool_calls=None,
            ),
        )
    ],
    created=1707258346,
    model='gpt-3.5-turbo-0613',
    object='chat.completion',
    system_fingerprint=None,
    usage=CompletionUsage(completion_tokens=16, prompt_tokens=73, total_tokens=89),
)
"""
```

!!! tip "Accessing tokens usage"

    This is the recommended way to access the tokens usage, since it is a pydantic model you can use any of the pydantic model methods on it. For example, you can access the `total_tokens` by doing `user._raw_response.usage.total_tokens`. Note that this also includes the tokens used during any previous unsuccessful attempts.

    In the future, we may add additional hooks to the `raw_response` to make it easier to access the tokens usage.