small fixes to tutorials (#168)

Co-authored-by: Jason Liu <jxnl@users.noreply.github.com>
Francisco Ingham
2023-11-11 19:00:31 -08:00
committed by GitHub
parent df20cb4d16
commit 1ec9114d61
8 changed files with 109 additions and 43 deletions
+37 -12
@@ -45,22 +45,47 @@ assert user.name == "Jason"
assert user.age == 25
```
!!! note "Using `openai<1.0.0`"
**"Using `openai<1.0.0`"**
If you're using `openai<1.0.0` then make sure you `pip install instructor<0.3.0`
where you can patch a global client like so:
If you're using `openai<1.0.0` then make sure you `pip install instructor<0.3.0`
where you can patch a global client like so:
```python hl_lines="4 8"
import openai
import instructor
```python hl_lines="4 8"
import openai
import instructor
instructor.patch()
instructor.patch()
user = openai.ChatCompletion.create(
...,
response_model=UserDetail,
)
```
user = openai.ChatCompletion.create(
...,
response_model=UserDetail,
)
```
**"Using async clients"**
For async clients, use `apatch` instead of `patch`, like so:
```py
import instructor
from openai import AsyncOpenAI
from pydantic import BaseModel

# Patch the async client so `create` accepts `response_model`
aclient = instructor.apatch(AsyncOpenAI())


class UserExtract(BaseModel):
    name: str
    age: int


# `await` must run inside a coroutine (or a notebook / async REPL)
model = await aclient.chat.completions.create(
    model="gpt-3.5-turbo",
    response_model=UserExtract,
    messages=[
        {"role": "user", "content": "Extract jason is 25 years old"},
    ],
)
assert isinstance(model, UserExtract)
```
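The `await` in the snippet above is only valid inside a coroutine or a notebook. As a rough sketch of how such a call could be driven from a plain script, the block below uses `asyncio.run`. Note that `fake_create` is a hypothetical stand-in for the patched `aclient.chat.completions.create`, and the dataclass replaces the Pydantic model, so the example runs without an API key or network access; the returned values are made up for illustration.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class UserExtract:
    # Plain dataclass standing in for the Pydantic model from the docs
    name: str
    age: int


async def fake_create(**kwargs) -> UserExtract:
    # Hypothetical stub; the real call would be
    # `await aclient.chat.completions.create(...)`
    await asyncio.sleep(0)  # yield to the event loop, as a real request would
    return kwargs["response_model"](name="jason", age=25)


async def main() -> UserExtract:
    # Awaiting is only legal inside an `async def`
    return await fake_create(
        model="gpt-3.5-turbo",
        response_model=UserExtract,
        messages=[{"role": "user", "content": "Extract jason is 25 years old"}],
    )


# Start an event loop, run the coroutine to completion, and close the loop
user = asyncio.run(main())
```

With the real patched client, only the body of `main` would change; the `asyncio.run(main())` entry point stays the same.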
## Installation
+37 -13
@@ -2,11 +2,9 @@
_Structured extraction in Python, powered by OpenAI's function calling api, designed for simplicity, transparency, and control._
Built to interact solely with openai's function calling api from python. It's designed to be intuitive, easy to use, and provide great visibility into your prompts.
---
[Star us on Github!](https://jxnl.github.io/instructor)
[Star us on Github!](https://jxnl.github.io/instructor).
[![Buy Me a Coffee](https://img.shields.io/badge/Buy%20Me%20a%20Coffee-Donate-yellow)](https://www.buymeacoffee.com/jxnlco)
[![Downloads](https://img.shields.io/pypi/dm/instructor.svg)](https://pypi.python.org/pypi/instructor)
@@ -19,12 +17,12 @@ Built to interact solely with openai's function calling api from python. It's de
[![PyPI version](https://img.shields.io/pypi/v/instructor.svg)](https://pypi.python.org/pypi/instructor)
[![PyPI pyversions](https://img.shields.io/pypi/pyversions/instructor.svg)](https://pypi.python.org/pypi/instructor)
---
Built to interact solely with OpenAI's function calling API from Python. It's designed to be intuitive, easy to use, and provide great visibility into your prompts.
## Usage
```py hl_lines="5 13"
from openai import OpenAI()
from openai import OpenAI
import instructor
# Enables `response_model`
@@ -47,7 +45,7 @@ assert user.name == "Jason"
assert user.age == 25
```
!!! note "Using `openai<1.0.0`"
!!! warning "Using `openai<1.0.0`"
If you're using `openai<1.0.0` then make sure you `pip install instructor<0.3.0`
where you can patch a global client like so:
@@ -64,6 +62,31 @@ assert user.age == 25
)
```
!!! note "Using async clients"
    For async clients, use `apatch` instead of `patch`, like so:

    ```py
    import instructor
    from openai import AsyncOpenAI
    from pydantic import BaseModel

    # Patch the async client so `create` accepts `response_model`
    aclient = instructor.apatch(AsyncOpenAI())


    class UserExtract(BaseModel):
        name: str
        age: int


    # `await` must run inside a coroutine (or a notebook / async REPL)
    model = await aclient.chat.completions.create(
        model="gpt-3.5-turbo",
        response_model=UserExtract,
        messages=[
            {"role": "user", "content": "Extract jason is 25 years old"},
        ],
    )
    assert isinstance(model, UserExtract)
    ```
## Installation
To get started you need to install it using `pip`. Run the following command in your terminal:
@@ -89,9 +112,10 @@ The patch introduces 3 features to the `ChatCompletion` class:
First, import the required libraries and apply the patch function to the OpenAI module. This exposes new functionality with the response_model parameter.
```python hl_lines="6"
```python
import instructor
from openai import OpenAI
from pydantic import BaseModel
# This enables response_model keyword
# from client.chat.completions.create
@@ -115,7 +139,7 @@ class UserDetail(BaseModel):
Use the `client.chat.completions.create` method to send a prompt and extract the data into the Pydantic object. The `response_model` parameter specifies the Pydantic model to use for extraction. It's helpful to annotate the variable with the type of the response model,
which will help your IDE provide autocomplete and spell check.
```python hl_lines="3"
```python
user: UserDetail = client.chat.completions.create(
model="gpt-3.5-turbo",
response_model=UserDetail,
@@ -128,7 +152,7 @@ assert user.name == "Jason"
assert user.age == 25
```
## Advanced: Pydantic Validation
## Pydantic Validation
Validation can also be plugged into the same Pydantic model. Here, if the answer attribute contains content that violates the rule "don't say objectionable things," Pydantic will raise a validation error.
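As a rough illustration of the mechanism, the sketch below attaches a rule to a Pydantic model and collects the resulting error messages. It assumes Pydantic v2 (`field_validator`) and swaps in a simple keyword check for the LLM-backed rule, so the banned-word list and the `validate_answer` helper are hypothetical and only there to keep the example runnable offline.

```python
from pydantic import BaseModel, ValidationError, field_validator

# Hypothetical rule: a keyword list standing in for an LLM-based check
BANNED_WORDS = {"objectionable"}


class QuestionAnswer(BaseModel):
    question: str
    answer: str

    @field_validator("answer")
    @classmethod
    def answer_is_not_objectionable(cls, value: str) -> str:
        # Raising ValueError makes Pydantic report a validation error
        if any(word in value.lower() for word in BANNED_WORDS):
            raise ValueError("The statement is objectionable.")
        return value


def validate_answer(question: str, answer: str) -> list[str]:
    """Return the validation error messages, or an empty list if valid."""
    try:
        QuestionAnswer(question=question, answer=answer)
    except ValidationError as e:
        return [error["msg"] for error in e.errors()]
    return []


ok_errors = validate_answer("What is 2 + 2?", "4")
bad_errors = validate_answer("What is 2 + 2?", "Something objectionable")
```

The same model can then be passed as `response_model`, so the rule is checked on whatever the model returns.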
@@ -155,20 +179,20 @@ except ValidationError as e:
It's important to note here that the error message is generated by the LLM, not the code, so it'll be helpful for re-asking the model.
```plaintext hl_lines="3"
```plaintext
1 validation error for QuestionAnswer
answer
Assertion failed, The statement is objectionable. (type=assertion_error)
```
## Advanced: Reask on validation error
## Reask on validation error
Here, the `UserDetails` model is passed as the `response_model`, and `max_retries` is set to 2.
```python hl_lines="15-18 22 23 29"
```python
from openai import OpenAI
import instructor
from openai import OpenAI
from pydantic import BaseModel, field_validator
# Apply the patch to the OpenAI client
+2 -1
@@ -1,7 +1,7 @@
from .function_calls import OpenAISchema, openai_function, openai_schema
from .distil import FinetuneFormat, Instructions
from .dsl import MultiTask, Maybe, llm_validator, CitationMixin
from .patch import patch
from .patch import patch, apatch
__all__ = [
"OpenAISchema",
@@ -11,6 +11,7 @@ __all__ = [
"Maybe",
"openai_schema",
"patch",
"apatch",
"llm_validator",
"FinetuneFormat",
"Instructions",
+3 -5
@@ -134,9 +134,7 @@ def retry_sync(
raise e
def wrap_chatcompletion(func: Callable) -> Callable:
is_async = inspect.iscoroutinefunction(func)
def wrap_chatcompletion(func: Callable, is_async: bool = None) -> Callable:
@wraps(func)
async def new_chatcompletion_async(
response_model=None,
@@ -211,7 +209,7 @@ def apatch(client):
- `validation_context` parameter to validate the response using the pydantic model
- `strict` parameter to use strict json parsing
"""
client.chat.completions.acreate = wrap_chatcompletion(
client.chat.completions.acreate
client.chat.completions.create = wrap_chatcompletion(
client.chat.completions.create, is_async=True
)
return client
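The hunk above replaces detection via `inspect.iscoroutinefunction` with an explicit `is_async` argument, but does not show how the flag is used inside the function. One plausible reading is sketched below. `wrap_call`, the `{"wrapped": ...}` result, and the fallback to inspection when the flag is `None` are all assumptions made for illustration, not instructor's actual implementation.

```python
import asyncio
import inspect
from functools import wraps
from typing import Callable, Optional


def wrap_call(func: Callable, is_async: Optional[bool] = None) -> Callable:
    # Hypothetical helper: an explicit flag wins; inspection is only a fallback
    if is_async is None:
        is_async = inspect.iscoroutinefunction(func)

    @wraps(func)
    async def wrapper_async(*args, **kwargs):
        # Await the wrapped coroutine, then post-process its result
        result = await func(*args, **kwargs)
        return {"wrapped": result}

    @wraps(func)
    def wrapper_sync(*args, **kwargs):
        # Call the wrapped function directly, then post-process its result
        result = func(*args, **kwargs)
        return {"wrapped": result}

    return wrapper_async if is_async else wrapper_sync


def sync_create(prompt: str) -> str:
    return prompt.upper()


async def async_create(prompt: str) -> str:
    return prompt.upper()


sync_result = wrap_call(sync_create)("hi")
async_result = asyncio.run(wrap_call(async_create, is_async=True)("hi"))
```

Passing the flag explicitly means the caller decides which wrapper is returned, rather than relying on what inspection reports for the client method.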
+1 -1
@@ -1,6 +1,6 @@
[tool.poetry]
name = "instructor"
version = "0.3.1"
version = "0.3.2"
description = "Helper functions that allow us to improve openai's function_call ergonomics"
authors = ["Jason Liu <jason@jxnl.co>"]
license = "MIT"
+24 -1
@@ -1,9 +1,32 @@
import pytest
from pydantic import BaseModel
from openai import OpenAI
from openai import OpenAI, AsyncOpenAI
import instructor
client = instructor.patch(OpenAI())
aclient = instructor.apatch(AsyncOpenAI())
@pytest.mark.asyncio
async def test_async_runmodel():
    class UserExtract(BaseModel):
        name: str
        age: int

    model = await aclient.chat.completions.create(
        model="gpt-3.5-turbo",
        response_model=UserExtract,
        messages=[
            {"role": "user", "content": "Extract jason is 25 years old"},
        ],
    )
    assert isinstance(model, UserExtract), "Should be instance of UserExtract"
    assert model.name.lower() == "jason"
    assert hasattr(
        model, "_raw_response"
    ), "The raw response should be available from OpenAI"
def test_runmodel():
+2 -2
@@ -38,7 +38,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"We have a `name` field, which is a string, and an `age` field, which is an integer. However, if we were to load this into a dictionary, we would have no way of knowing if the data is valid. For example, we could have a string for the age, or we could have a float for the age. We could also have a string for the name, or we could have a list for the name. We have no way of knowing if the data is valid, and we have no way of knowing if the data is valid."
"We have a `name` field, which is a string, and an `age` field, which is an integer. However, if we were to load this into a dictionary, we would have no way of knowing if the data is valid. For example, we could have a string for the age, or we could have a float for the age. We could also have a string for the name, or we could have a list for the name."
]
},
{
@@ -486,7 +486,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Now you can see that when we set response_model create call will now return a pydantic model, and we can use that to validate the data. and work with it as if it was a python object."
"Now you can see that when we set `response_model`, the `create` call returns a Pydantic model, which we can use to validate the data and work with as if it were a Python object."
]
}
],
+3 -8
@@ -107,7 +107,7 @@
"source": [
"### Example 1) Improving Extractions\n",
"\n",
"One of the big limitations is that often times the query we embed and the text \n",
"One of the big limitations is that often times the query we embed and the text that we want to retrieve are not sufficiently close in the semantic space.\n",
"A common method of using structured output is to extract information from a document and use it to answer a question. Directly, we can be creative in how we extract, summarize and generate potential questions in order for our embeddings to do better. \n",
"\n",
"For example, instead of using just a text chunk we could try to:\n",
@@ -511,9 +511,9 @@
"source": [
"### Example 4) Decomposing questions \n",
"\n",
"Lastly, a lightly more complex example of a problem that can be solved with structured output is decomposing questions. Where you ultimately want to decompose a question into a series of sub-questions that can be answered by a search backend. For example \n",
"Lastly, a slightly more complex example of a problem that can be solved with structured output is decomposing questions, where you ultimately want to decompose a question into a series of sub-questions that can be answered by a search backend. For example:\n",
"\n",
"\"Whats the difference in populations of jason's home country and canadata?\"\n",
"\"Whats the difference in populations of jason's home country and canada?\"\n",
"\n",
"You'd ultimately need to know a few things\n",
"\n",
@@ -525,11 +525,6 @@
"This would not be done correctly as a single query, nor would it be done in parallel; however, there are some opportunities to run sub-questions in parallel, since not all of them depend on each other."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": []
},
{
"cell_type": "code",
"execution_count": 35,