update docs

This commit is contained in:
Jason Liu
2023-11-20 18:58:57 -05:00
parent e9ce94f743
commit df8cdeadd1
11 changed files with 362 additions and 13 deletions
+3
@@ -0,0 +1,3 @@
!!! warning "This page is a work in progress"
    This page is a work in progress. Check out [Pydantic's documentation](https://docs.pydantic.dev/latest/concepts/alias/).
+25
@@ -0,0 +1,25 @@
To prevent data misalignment, we can use Enums for standardized fields. Always include an "Other" option as a fallback so the model can signal uncertainty.
```python hl_lines="8 13"
from enum import Enum
from pydantic import BaseModel, Field

class Role(Enum):
    PRINCIPAL = "PRINCIPAL"
    TEACHER = "TEACHER"
    STUDENT = "STUDENT"
    OTHER = "OTHER"

class UserDetail(BaseModel):
    age: int
    name: str
    role: Role = Field(description="Correctly assign one of the predefined roles to the user.")
```
If you're having a hard time with `Enum`, an alternative is to use `Literal` instead.
```python hl_lines="7"
from typing import Literal
from pydantic import BaseModel

class UserDetail(BaseModel):
    age: int
    name: str
    role: Literal["PRINCIPAL", "TEACHER", "STUDENT", "OTHER"]
```
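To see what the constraint buys you, here is a minimal sketch (assuming Pydantic v2) that validates one allowed role and one role outside the set. The sample values are made up for illustration.

```python
from typing import Literal

from pydantic import BaseModel, ValidationError


class UserDetail(BaseModel):
    age: int
    name: str
    role: Literal["PRINCIPAL", "TEACHER", "STUDENT", "OTHER"]


# A value from the allowed set validates normally.
user = UserDetail(age=34, name="Ada", role="TEACHER")
print(user.role)

# A value outside the allowed set is rejected instead of being stored silently.
try:
    UserDetail(age=34, name="Ada", role="JANITOR")
    rejected = False
except ValidationError:
    rejected = True
print(rejected)
```

Because unknown values fail validation, the "OTHER" option is what gives the model a valid way to express uncertainty.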
+160
@@ -0,0 +1,160 @@
The `pydantic.Field` function is used to customize fields of models and add metadata to them. This page is a near replica of the parts of the Pydantic [documentation](https://docs.pydantic.dev/latest/concepts/fields/) that are relevant to prompting; check out the original to learn more.
## Default values
The `default` parameter is used to define a default value for a field.
```py
from pydantic import BaseModel, Field


class User(BaseModel):
    name: str = Field(default='John Doe')


user = User()
print(user)
#> name='John Doe'
```
You can also use `default_factory` to define a callable that will be called to generate a default value.
```py
from uuid import uuid4

from pydantic import BaseModel, Field


class User(BaseModel):
    id: str = Field(default_factory=lambda: uuid4().hex)
```
!!! info
    The `default` and `default_factory` parameters are mutually exclusive.

!!! note
    Using `typing.Optional` does not give a field a default value of `None`; you must use `default` or `default_factory` to define a default value. Only then will the field be considered not required when sent to the language model.
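The difference shows up in the generated JSON schema. The following is a minimal sketch, assuming Pydantic v2; the two model names are hypothetical and exist only for the comparison.

```python
from typing import Optional

from pydantic import BaseModel, Field


class WithoutDefault(BaseModel):
    # Optional only widens the type to allow None; the field is still required.
    email: Optional[str] = Field(description="The email of the user.")


class WithDefault(BaseModel):
    # Supplying a default is what makes the field not required.
    email: Optional[str] = Field(default=None, description="The email of the user.")


required_without = WithoutDefault.model_json_schema().get("required", [])
required_with = WithDefault.model_json_schema().get("required", [])
print(required_without)
print(required_with)
```

Only the model without a default lists `email` under `required`.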
## Using `Annotated`
The `Field` function can also be used together with `Annotated`.
```py
from uuid import uuid4

from typing_extensions import Annotated

from pydantic import BaseModel, Field


class User(BaseModel):
    id: Annotated[str, Field(default_factory=lambda: uuid4().hex)]
```
## Exclude
The `exclude` parameter can be used to control which fields are excluded when exporting the model. This is helpful when you want to leave out fields that are not relevant to the model generation, like `scratch_pad` or `chain_of_thought`.
See the following example:
```py
from datetime import date

from pydantic import BaseModel, Field


class DateRange(BaseModel):
    chain_of_thought: str = Field(
        description="Reasoning behind the date range.",
        exclude=True,
    )
    start_date: date
    end_date: date


date_range = DateRange(
    chain_of_thought="""
        I want to find the date range for the last 30 days.
        Today is 2021-01-30 therefore the start date
        should be 2021-01-01 and the end date is 2021-01-30""",
    start_date=date(2021, 1, 1),
    end_date=date(2021, 1, 30),
)
print(date_range.model_dump_json())
#> {"start_date":"2021-01-01","end_date":"2021-01-30"}
```
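Note that `exclude` only affects serialization. As a rough check, assuming Pydantic v2 and its default (validation-mode) JSON schema, the sketch below compares the fields in the schema with the fields in the exported data. The `chain_of_thought` text is a made-up sample.

```python
from datetime import date

from pydantic import BaseModel, Field


class DateRange(BaseModel):
    chain_of_thought: str = Field(
        description="Reasoning behind the date range.",
        exclude=True,
    )
    start_date: date
    end_date: date


date_range = DateRange(
    chain_of_thought="Covers January 2021 up to the 30th.",
    start_date=date(2021, 1, 1),
    end_date=date(2021, 1, 30),
)

# The schema still describes the field, so it can still be requested.
schema_fields = list(DateRange.model_json_schema()["properties"])
# The exported data leaves it out.
dumped_fields = list(date_range.model_dump())
print(schema_fields)
print(dumped_fields)
```

The value also remains available on the instance as `date_range.chain_of_thought`.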
## Customizing JSON Schema
Some `Field` parameters are used exclusively to customize the generated JSON Schema:
- `title`: The title of the field.
- `description`: The description of the field.
- `examples`: The examples of the field.
- `json_schema_extra`: Extra JSON Schema properties to be added to the field.
These all work as great opportunities to add more information to the JSON Schema as part
of your prompt engineering.
Here's an example:
```py
from pydantic import BaseModel, EmailStr, Field, SecretStr


class User(BaseModel):
    age: int = Field(description='Age of the user')
    email: EmailStr = Field(examples=['marcelo@mail.com'])
    name: str = Field(title='Username')
    password: SecretStr = Field(
        json_schema_extra={
            'title': 'Password',
            'description': 'Password of the user',
            'examples': ['123456'],
        }
    )


print(User.model_json_schema())
"""
{
    'properties': {
        'age': {
            'description': 'Age of the user',
            'title': 'Age',
            'type': 'integer',
        },
        'email': {
            'examples': ['marcelo@mail.com'],
            'format': 'email',
            'title': 'Email',
            'type': 'string',
        },
        'name': {'title': 'Username', 'type': 'string'},
        'password': {
            'description': 'Password of the user',
            'examples': ['123456'],
            'format': 'password',
            'title': 'Password',
            'type': 'string',
            'writeOnly': True,
        },
    },
    'required': ['age', 'email', 'name', 'password'],
    'title': 'User',
    'type': 'object',
}
"""
```
## General notes on JSON schema generation
- The JSON schema for Optional fields indicates that the value null is allowed.
- The Decimal type is exposed in JSON schema (and serialized) as a string.
- The JSON schema does not preserve namedtuples as namedtuples.
- When they differ, you can specify whether you want the JSON schema to represent the inputs to validation or the outputs from serialization.
- Sub-models used are added to the `$defs` JSON attribute and referenced, as per the spec.
- Sub-models with modifications (via the Field class) like a custom title, description, or default value, are recursively included instead of referenced.
- The description for models is taken from either the docstring of the class or the argument description to the Field class.
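A couple of these notes can be checked directly. The sketch below assumes Pydantic v2; `Address` and `Person` are hypothetical models used only to inspect the schema for an `Optional` field and a sub-model.

```python
from typing import Optional

from pydantic import BaseModel


class Address(BaseModel):
    city: str


class Person(BaseModel):
    nickname: Optional[str] = None
    address: Address


schema = Person.model_json_schema()

# Optional fields allow null alongside the declared type.
print(schema["properties"]["nickname"]["anyOf"])
# Sub-models are referenced rather than inlined...
print(schema["properties"]["address"])
# ...and their definitions live under `$defs`.
print(list(schema["$defs"]))
```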
+4 -10
@@ -1,12 +1,6 @@
# Handling Missing Data with `Maybe`
# Handling Missing Data
In this post, we will demonstrate how to use the `Maybe` pattern to manage missing data and employ pattern matching to handle errors in a structured manner.
## What is `Maybe`?
The `Maybe` pattern is a concept in functional programming used for error handling. Instead of raising exceptions or returning `None`, you can use a `Maybe` type to encapsulate both the result and potential errors. This pattern is particularly useful when making OpenAI API calls, as providing language models with an escape mechanism effectively reduces hallucinations. Consequently, we can construct a prompt that closely resembles regular programming.
Towards the end, we will demonstrate how to use `Maybe` instances in pattern matching, which offers an excellent approach for handling errors in a structured manner.
The `Maybe` pattern is a concept in functional programming used for error handling. Instead of raising exceptions or returning `None`, you can use a `Maybe` type to encapsulate both the result and potential errors. This pattern is particularly useful when making LLM calls, as providing language models with an escape hatch can effectively reduce hallucinations.
## Defining the Model
@@ -76,7 +70,7 @@ user2 = extract("Unknown user")
As you can see, when the data is extracted successfully, the `result` field contains the `UserDetail` instance. When an error occurs, the `error` field is set to `True`, and the `message` field contains the error message.
## Handle the result
## Handling the result
There are a few ways we can handle the result. Normally, we can just access the individual fields.
@@ -89,7 +83,7 @@ def process_user_detail(maybe_user: MaybeUser):
print(f"Not found: {user1.message}")
```
## Pattern Matching
### Pattern Matching
We can also use pattern matching to handle the result. This is a great way to handle errors in a structured way.
+150
@@ -0,0 +1,150 @@
# Response Model
Defining LLM output schemas in Pydantic is done via `pydantic.BaseModel`. To learn more about models in Pydantic, check out their [documentation](https://docs.pydantic.dev/latest/concepts/models/).
After defining a Pydantic model, we can use it as the `response_model` in our client `create` calls to OpenAI. The job of the `response_model` is to define the schema and prompts for the language model, validate the response from the API, and return a Pydantic model instance.
## Prompting
When defining a response model, we can use docstrings and field annotations to define the prompt that will be used to generate the response.
```python
from pydantic import BaseModel, Field


class User(BaseModel):
    """
    This is the prompt that will be used to generate the response.
    Any instructions here will be passed to the language model.
    """

    name: str = Field(description="The name of the user.")
    age: int = Field(description="The age of the user.")
```
Here all docstrings, types, and field annotations will be used to generate the prompt. The prompt will be generated by the `create` method of the client and will be used to generate the response.
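How the client forwards the schema is not shown on this page, but you can inspect what Pydantic itself generates. A minimal sketch, assuming Pydantic v2 and a made-up docstring:

```python
from pydantic import BaseModel, Field


class User(BaseModel):
    """Extract the user mentioned in the text."""

    name: str = Field(description="The name of the user.")
    age: int = Field(description="The age of the user.")


schema = User.model_json_schema()

# The class docstring becomes the schema-level description.
print(schema["description"])
# Each Field description is attached to its property.
print(schema["properties"]["name"]["description"])
print(schema["properties"]["age"]["type"])
```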
## Optional Values
If we use `Optional` together with `default`, the field will be considered not required when sent to the language model.
```python
from typing import Optional

from pydantic import BaseModel, Field


class User(BaseModel):
    name: str = Field(description="The name of the user.")
    age: int = Field(description="The age of the user.")
    email: Optional[str] = Field(description="The email of the user.", default=None)
```
## Dynamic model creation
There are some occasions where it is desirable to create a model using runtime information to specify the fields. For this, Pydantic provides the `create_model` function to allow models to be created on the fly:
```python
from pydantic import BaseModel, create_model


class FooModel(BaseModel):
    foo: str
    bar: int = 123


BarModel = create_model(
    'BarModel',
    apple=(str, 'russet'),
    banana=(str, 'yellow'),
    __base__=FooModel,
)
print(BarModel)
#> <class '__main__.BarModel'>
print(BarModel.model_fields.keys())
#> dict_keys(['foo', 'bar', 'apple', 'banana'])
```
??? notes "When would I use this?"

    Consider a situation where the model is defined dynamically, based on some configuration or database. For example, we could have a database table that stores the properties of a model for some model name or id. We could then query the database for those properties and use them to create the model.

    ```sql
    SELECT property_name, property_type, description
    FROM prompt
    WHERE model_name = {model_name}
    ```

    We can then use this information to create the model.

    ```python
    from typing import List

    from pydantic import BaseModel, Field, create_model

    types = {
        'string': str,
        'integer': int,
        'boolean': bool,
        'number': float,
        'List[str]': List[str],
    }

    # `cursor` is assumed to be a database cursor that has run the query above.
    # Wrapping `description` in `Field` makes it the field's description;
    # a bare `(type, value)` tuple would treat it as the default value.
    BarModel = create_model(
        'User',
        **{
            property_name: (types[property_type], Field(description=description))
            for property_name, property_type, description in cursor.fetchall()
        },
        __base__=BaseModel,
    )
    ```

    This would be useful when different users have different descriptions for the same model. We can use the same model but have different prompts for each user.
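For a version of this idea that runs end to end, here is a sketch that swaps the unspecified database for an in-memory SQLite table. The table layout, the sample rows, and the parameterized query are assumptions for illustration. Each description is wrapped in `Field(description=...)` so that it becomes the field's description; a bare `(type, value)` tuple would be treated by `create_model` as a default value.

```python
import sqlite3

from pydantic import BaseModel, Field, create_model

# Hypothetical table mirroring the columns used in the query above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE prompt (model_name TEXT, property_name TEXT, "
    "property_type TEXT, description TEXT)"
)
conn.executemany(
    "INSERT INTO prompt VALUES (?, ?, ?, ?)",
    [
        ("User", "name", "string", "The name of the user."),
        ("User", "age", "integer", "The age of the user."),
    ],
)

types = {"string": str, "integer": int, "boolean": bool, "number": float}

# A bound parameter is used instead of string formatting.
rows = conn.execute(
    "SELECT property_name, property_type, description "
    "FROM prompt WHERE model_name = ?",
    ("User",),
).fetchall()

User = create_model(
    "User",
    **{
        property_name: (types[property_type], Field(description=description))
        for property_name, property_type, description in rows
    },
    __base__=BaseModel,
)

user = User(name="Ada", age=36)
print(user)
print(User.model_fields["name"].description)
```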
## Structural Pattern Matching
Pydantic supports structural pattern matching for models, as introduced by PEP 636 in Python 3.10.
```python
from pydantic import BaseModel


class Pet(BaseModel):
    name: str
    species: str


a = Pet(name='Bones', species='dog')

match a:
    # match `species` to 'dog', declare and initialize `dog_name`
    case Pet(species='dog', name=dog_name):
        print(f'{dog_name} is a dog')
        #> Bones is a dog
    # default case
    case _:
        print('No dog matched')
```
## Adding Behavior
We can add methods to our Pydantic models just as we would with any plain Python class. We might want to do this to add some custom logic to our models.
```python
from typing import Literal

import instructor
from openai import OpenAI
from pydantic import BaseModel

client = instructor.patch(OpenAI())


class SearchQuery(BaseModel):
    query: str
    query_type: Literal["web", "image", "video"]

    def execute(self):
        # do some logic here; `results` is a placeholder for whatever it produces
        results = []
        return results


query = client.chat.completions.create(
    ..., response_model=SearchQuery
)
results = query.execute()
```
Now we can call `execute` on our model instance after extracting it from a language model. If you want to see more examples of this, check out our post on [RAG is more than embeddings](../blog/posts/rag-and-beyond.md).
+2
@@ -1,3 +1,5 @@
# General Tips for Prompt Engineering
The overarching theme of using Instructor and Pydantic for function calling is to make the models as self-descriptive, modular, and flexible as possible, while maintaining data integrity and ease of use.
- **Modularity**: Design self-contained components for reuse.
+3
@@ -0,0 +1,3 @@
!!! warning "This page is a work in progress"
    This page is a work in progress. Check out [Pydantic's documentation](https://docs.pydantic.dev/latest/concepts/type_adapter/).
+3
@@ -0,0 +1,3 @@
!!! warning "This page is a work in progress"
    This page is a work in progress. Check out [Pydantic's documentation](https://docs.pydantic.dev/latest/concepts/types/).
+3
@@ -0,0 +1,3 @@
!!! warning "This page is a work in progress"
    This page is a work in progress. Check out [Pydantic's documentation](https://docs.pydantic.dev/latest/concepts/union/).
+9 -3
@@ -126,11 +126,17 @@ nav:
- Help with Instructor: 'help.md'
- Installation: 'installation.md'
- Contributing: 'contributing.md'
- Tips: 'concepts/prompting.md'
- Concepts:
- Schema Engineering: 'concepts/prompting.md'
- Lists: "concepts/multitask.md"
- Missing Content: "concepts/maybe.md"
- Models: 'concepts/models.md'
- Fields: 'concepts/fields.md'
- Types: 'concepts/types.md'
- Streaming: "concepts/lists.md"
- Union: 'concepts/union.md'
- Alias: 'concepts/alias.md'
- Type Adapter: 'concepts/typeadapter.md'
- Validators: "concepts/reask_validation.md"
- Missing: "concepts/maybe.md"
- Distillation: "concepts/distillation.md"
- Philosophy: 'concepts/philosophy.md'
- Cookbook: