# Prompt Engineering for Function Calling

When using `instructor` and Pydantic for function calling, the overarching goal is to make models as self-descriptive, modular, and flexible as possible while maintaining data integrity and ease of use.

- **Modularity**: Design self-contained components for reuse.
- **Self-Description**: Use Pydantic's `Field` for clear field descriptions.
- **Optionality**: Use Python's `Optional` type for nullable fields and set sensible defaults.
- **Standardization**: Employ enumerations for fields with a fixed set of values; include a fallback option.
- **Dynamic Data**: Use key-value pairs for arbitrary properties and limit list lengths.
- **Entity Relationships**: Define explicit identifiers and relationship fields.
- **Contextual Logic**: Optionally add a "chain of thought" field in reusable components for extra context.

## Modular Chain of Thought

Adding a "chain of thought" field improves data quality, and it can be scoped to individual components rather than applied globally to the whole response.

```python hl_lines="4 5"
from pydantic import BaseModel, Field

class Role(BaseModel):
    chain_of_thought: str = Field(...,
        description="Think step by step to determine the correct title")
    title: str

class UserDetail(BaseModel):
    age: int
    name: str
    role: Role
```
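
As a minimal sketch of how the component-level reasoning field behaves, the snippet below validates a hand-written payload that stands in for a model response. The payload values are illustrative assumptions, not real model output.

```python
from pydantic import BaseModel, Field


class Role(BaseModel):
    chain_of_thought: str = Field(
        ..., description="Think step by step to determine the correct title"
    )
    title: str


class UserDetail(BaseModel):
    age: int
    name: str
    role: Role


# Hypothetical payload, shaped like the arguments of a function call.
payload = {
    "age": 34,
    "name": "Ada",
    "role": {
        "chain_of_thought": "She runs the school, so she is the principal.",
        "title": "Principal",
    },
}

user = UserDetail(**payload)
# The reasoning stays attached to the component it explains.
print(user.role.title)
```

Only `Role` carries the reasoning field, so the rest of `UserDetail` stays compact.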

## Utilize Optional Attributes

Use Python's `Optional` type with a default of `None` so the model is not pushed toward undesired placeholder values such as empty strings.

```python hl_lines="6"
from typing import Optional

class UserDetail(BaseModel):
    age: int
    name: str
    role: Optional[str] = Field(default=None)
```
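
The effect of the default can be sketched by constructing the model with and without a role. The sample names and ages are made up for illustration.

```python
from typing import Optional

from pydantic import BaseModel, Field


class UserDetail(BaseModel):
    age: int
    name: str
    role: Optional[str] = Field(default=None)


without_role = UserDetail(age=29, name="Grace")
with_role = UserDetail(age=41, name="Alan", role="teacher")

# Absence is represented as None rather than a placeholder such as "".
print(without_role.role is None)
```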

## Handling Errors Within Function Calls

You can create a wrapper class to hold either the result of an operation or an error message. This allows you to remain within a function call even if an error occurs, facilitating better error handling without breaking the code flow.

```python
class UserDetail(BaseModel):
    age: int
    name: str
    role: Optional[str] = Field(default=None)

class MaybeUser(BaseModel):
    result: Optional[UserDetail] = Field(default=None)
    error: bool = Field(default=False)
    # Give message a default so it is truly optional in Pydantic v2 as well.
    message: Optional[str] = Field(default=None)

    def __bool__(self):
        return self.result is not None
```

With the `MaybeUser` class, you either receive a `UserDetail` object in `result` or get an error message in `message`.
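
A short sketch of how calling code might branch on the wrapper. The `describe` helper and the sample values are hypothetical and only illustrate the truthiness check provided by `__bool__`.

```python
from typing import Optional

from pydantic import BaseModel, Field


class UserDetail(BaseModel):
    age: int
    name: str
    role: Optional[str] = Field(default=None)


class MaybeUser(BaseModel):
    result: Optional[UserDetail] = Field(default=None)
    error: bool = Field(default=False)
    message: Optional[str] = Field(default=None)

    def __bool__(self):
        return self.result is not None


def describe(maybe: MaybeUser) -> str:
    """Branch on the wrapper instead of raising inside the call."""
    if maybe:
        return f"{maybe.result.name} ({maybe.result.age})"
    return f"extraction failed: {maybe.message}"


found = MaybeUser(result=UserDetail(age=30, name="Ada"))
missing = MaybeUser(error=True, message="No user mentioned in the text.")

print(describe(found))
print(describe(missing))
```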

### Simplification with the Maybe Pattern

You can further simplify this by using `instructor` to create the `Maybe` pattern dynamically from any `BaseModel`.

```python
import instructor

MaybeUser = instructor.Maybe(UserDetail)
```

This lets you quickly create a `Maybe` type for any class without writing the wrapper by hand.
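
To make the idea concrete, here is a rough sketch of how such a factory could be written with Pydantic's `create_model`. This is not instructor's actual implementation; `MaybeBase` and `make_maybe` are hypothetical names introduced only for illustration.

```python
from typing import Optional, Type

from pydantic import BaseModel, create_model


class UserDetail(BaseModel):
    age: int
    name: str


class MaybeBase(BaseModel):
    """Hypothetical shared base: error fields plus the truthiness check."""

    error: bool = False
    message: Optional[str] = None

    def __bool__(self) -> bool:
        return getattr(self, "result", None) is not None


def make_maybe(model: Type[BaseModel]) -> Type[BaseModel]:
    """Build a wrapper model whose `result` field holds an optional `model`."""
    return create_model(
        f"Maybe{model.__name__}",
        __base__=MaybeBase,
        result=(Optional[model], None),
    )


MaybeUser = make_maybe(UserDetail)

ok = MaybeUser(result={"age": 30, "name": "Ada"})
failed = MaybeUser(error=True, message="No user found.")
print(bool(ok), bool(failed))
```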

## Tips for Enumerations

To prevent data misalignment, use Enums for standardized fields. Always include an "Other" option as a fallback so the model can signal uncertainty.

```python hl_lines="7 12"
from enum import Enum, auto

class Role(Enum):
    PRINCIPAL = auto()
    TEACHER = auto()
    STUDENT = auto()
    OTHER = auto()

class UserDetail(BaseModel):
    age: int
    name: str
    role: Role = Field(description="Correctly assign one of the predefined roles to the user.")
```
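
The fallback member can then be used to flag records for review. The sketch below assumes string-valued members instead of `auto()`, a variation chosen here so the sample payloads are readable; `needs_review` is a hypothetical helper.

```python
from enum import Enum

from pydantic import BaseModel, Field


class Role(str, Enum):
    # String values are an assumption of this sketch, not the original enum.
    PRINCIPAL = "principal"
    TEACHER = "teacher"
    STUDENT = "student"
    OTHER = "other"


class UserDetail(BaseModel):
    age: int
    name: str
    role: Role = Field(
        description="Correctly assign one of the predefined roles to the user."
    )


def needs_review(user: UserDetail) -> bool:
    """Treat the fallback member as a signal that the role was unclear."""
    return user.role is Role.OTHER


known = UserDetail(age=45, name="Ada", role="teacher")
unsure = UserDetail(age=52, name="Sam", role="other")
print(needs_review(known), needs_review(unsure))
```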

## Reiterate Long Instructions

For complex attributes, it helps to reiterate the instructions in the field's description.

```python hl_lines="5 11"
class Role(BaseModel):
    """
    Extract the role based on the following rules ...
    """
    instructions: str = Field(..., description="Restate the instructions and rules to correctly determine the title.")
    title: str

class UserDetail(BaseModel):
    age: int
    name: str
    role: Role
```
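
To see where these instructions end up, you can inspect the JSON schema that Pydantic generates for the component. The `json_schema` helper below is a hypothetical convenience that works with either Pydantic v1 or v2.

```python
from pydantic import BaseModel, Field


class Role(BaseModel):
    """
    Extract the role based on the following rules ...
    """

    instructions: str = Field(
        ...,
        description="Restate the instructions and rules to correctly determine the title.",
    )
    title: str


def json_schema(model):
    """Return the JSON schema using whichever method this Pydantic version has."""
    if hasattr(model, "model_json_schema"):  # Pydantic v2
        return model.model_json_schema()
    return model.schema()  # Pydantic v1


schema = json_schema(Role)
# The docstring becomes the schema description; the field keeps its own.
print(schema["description"])
print(schema["properties"]["instructions"]["description"])
```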

## Handle Arbitrary Properties

When you need to extract undefined attributes, use a list of key-value pairs.

```python hl_lines="10"
from typing import List

class Property(BaseModel):
    key: str
    value: str

class UserDetail(BaseModel):
    age: int
    name: str
    properties: List[Property] = Field(..., description="Extract any other properties that might be relevant.")
```
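
Downstream code often wants a plain mapping rather than a list of pairs. A minimal sketch, where `properties_as_dict` is a hypothetical helper and the sample data is invented:

```python
from typing import Dict, List

from pydantic import BaseModel, Field


class Property(BaseModel):
    key: str
    value: str


class UserDetail(BaseModel):
    age: int
    name: str
    properties: List[Property] = Field(
        ..., description="Extract any other properties that might be relevant."
    )


def properties_as_dict(user: UserDetail) -> Dict[str, str]:
    """Collapse the key-value pairs into a dict; later duplicates win."""
    return {prop.key: prop.value for prop in user.properties}


user = UserDetail(
    age=30,
    name="Ada",
    properties=[
        {"key": "hobby", "value": "chess"},
        {"key": "city", "value": "London"},
    ],
)
print(properties_as_dict(user))
```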

## Limiting the Length of Lists

When dealing with lists of attributes, especially arbitrary properties, it's crucial to manage the length. You can combine prompting with explicit numbering of the items to limit the list length, ensuring a manageable set of properties.

```python hl_lines="2 9"
class Property(BaseModel):
    index: str = Field(..., description="Monotonically increasing ID")
    key: str
    value: str

class UserDetail(BaseModel):
    age: int
    name: str
    properties: List[Property] = Field(..., description="Numbered list of arbitrary extracted properties, should be less than 6")
```
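
Because the limit is only expressed in the prompt, it can be worth checking the result after validation. The sketch below assumes the indexes come back as numeric strings; `MAX_PROPERTIES` and `check_properties` are hypothetical names.

```python
from typing import List

from pydantic import BaseModel, Field

MAX_PROPERTIES = 5  # mirrors "less than 6" in the field description


class Property(BaseModel):
    index: str = Field(..., description="Monotonically increasing ID")
    key: str
    value: str


class UserDetail(BaseModel):
    age: int
    name: str
    properties: List[Property] = Field(
        ...,
        description="Numbered list of arbitrary extracted properties, should be less than 6",
    )


def check_properties(user: UserDetail) -> List[str]:
    """Return a list of problems; an empty list means the limits were respected."""
    problems = []
    count = len(user.properties)
    if count > MAX_PROPERTIES:
        problems.append(f"expected at most {MAX_PROPERTIES} properties, got {count}")
    # Assumes every index is a numeric string such as "1", "2", ...
    indexes = [int(prop.index) for prop in user.properties]
    if any(later <= earlier for earlier, later in zip(indexes, indexes[1:])):
        problems.append("indexes are not strictly increasing")
    return problems


fine = UserDetail(
    age=30,
    name="Ada",
    properties=[
        {"index": "1", "key": "hobby", "value": "chess"},
        {"index": "2", "key": "city", "value": "London"},
    ],
)
too_many = UserDetail(
    age=30,
    name="Ada",
    properties=[{"index": str(i), "key": f"k{i}", "value": "v"} for i in range(1, 7)],
)
print(check_properties(fine), check_properties(too_many))
```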

**Using Tuples for Simple Types**

For simple types, tuples can be a more compact alternative to custom classes, especially when the properties don't require additional descriptions.

```python hl_lines="6"
from typing import List, Tuple

class UserDetail(BaseModel):
    age: int
    name: str
    properties: List[Tuple[int, str]] = Field(..., description="Numbered list of arbitrary extracted properties, should be less than 6")
```
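
As a quick illustration, JSON-style nested lists validate into `(index, text)` tuples. The sample properties are invented for the example.

```python
from typing import List, Tuple

from pydantic import BaseModel, Field


class UserDetail(BaseModel):
    age: int
    name: str
    properties: List[Tuple[int, str]] = Field(
        ...,
        description="Numbered list of arbitrary extracted properties, should be less than 6",
    )


# JSON has no tuple type, so a model response would carry nested lists.
user = UserDetail(
    age=30,
    name="Ada",
    properties=[[1, "plays chess"], [2, "lives in London"]],
)

for index, text in user.properties:
    print(index, text)
```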

## Advanced Arbitrary Properties

For multiple users, aim to use consistent key names when extracting properties.

```python
class UserDetails(BaseModel):
    """
    Extract information for multiple users.
    Use consistent key names for properties across users.
    """
    users: List[UserDetail]
```
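
One rough way to spot inconsistent naming is to compare the key sets across users. This is only a heuristic sketch: differing keys may simply reflect differing information, but they are a prompt to look for near-duplicates such as `city` and `hometown`. The `inconsistent_keys` helper is hypothetical.

```python
from typing import List, Set

from pydantic import BaseModel


class Property(BaseModel):
    key: str
    value: str


class UserDetail(BaseModel):
    age: int
    name: str
    properties: List[Property]


class UserDetails(BaseModel):
    """
    Extract information for multiple users.
    Use consistent key names for properties across users.
    """

    users: List[UserDetail]


def inconsistent_keys(batch: UserDetails) -> Set[str]:
    """Keys that appear for some users but not for all of them."""
    key_sets = [{prop.key for prop in user.properties} for user in batch.users]
    if not key_sets:
        return set()
    return set.union(*key_sets) - set.intersection(*key_sets)


batch = UserDetails(
    users=[
        {
            "age": 30,
            "name": "Ada",
            "properties": [
                {"key": "hobby", "value": "chess"},
                {"key": "city", "value": "London"},
            ],
        },
        {
            "age": 41,
            "name": "Alan",
            "properties": [
                {"key": "hobby", "value": "running"},
                {"key": "hometown", "value": "London"},
            ],
        },
    ]
)
print(sorted(inconsistent_keys(batch)))
```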

## Defining Relationships Between Entities

In cases where relationships exist between entities, it's vital to define them explicitly in the model. The following example demonstrates how to define relationships between users by incorporating an `id` and a `friends` field:

```python hl_lines="2 5 8"
class UserDetail(BaseModel):
    id: int = Field(..., description="Unique identifier for each user.")
    age: int
    name: str
    friends: List[int] = Field(..., description="Correct and complete list of friend IDs, representing relationships between users.")

class UserRelationships(BaseModel):
    users: List[UserDetail] = Field(..., description="Collection of users, correctly capturing the relationships among them.")
```
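
Explicit identifiers also make the extracted relationships checkable. The sketch below looks for friend IDs that do not correspond to any extracted user; `dangling_friend_ids` is a hypothetical helper and the sample data is invented.

```python
from typing import Dict, List

from pydantic import BaseModel, Field


class UserDetail(BaseModel):
    id: int = Field(..., description="Unique identifier for each user.")
    age: int
    name: str
    friends: List[int] = Field(
        ...,
        description="Correct and complete list of friend IDs, representing relationships between users.",
    )


class UserRelationships(BaseModel):
    users: List[UserDetail] = Field(
        ...,
        description="Collection of users, correctly capturing the relationships among them.",
    )


def dangling_friend_ids(relationships: UserRelationships) -> Dict[int, List[int]]:
    """Map each user ID to the friend IDs that match no extracted user."""
    known_ids = {user.id for user in relationships.users}
    dangling = {}
    for user in relationships.users:
        missing = [fid for fid in user.friends if fid not in known_ids]
        if missing:
            dangling[user.id] = missing
    return dangling


relationships = UserRelationships(
    users=[
        {"id": 1, "age": 30, "name": "Ada", "friends": [2]},
        {"id": 2, "age": 41, "name": "Alan", "friends": [1, 3]},
    ]
)
print(dangling_friend_ids(relationships))
```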

## Reusing Components with Different Contexts

You can reuse the same component for different contexts within a model. In this example, the `TimeRange` component is used for both `work_time` and `leisure_time`.

```python hl_lines="9 10"
class TimeRange(BaseModel):
    start_time: int = Field(..., description="The start time in hours.")
    end_time: int = Field(..., description="The end time in hours.")

class UserDetail(BaseModel):
    id: int = Field(..., description="Unique identifier for each user.")
    age: int
    name: str
    work_time: TimeRange = Field(..., description="Time range during which the user is working.")
    leisure_time: TimeRange = Field(..., description="Time range reserved for leisure activities.")
```

Sometimes a component like `TimeRange` needs extra context or logic to be filled in correctly. Adding a "chain of thought" field to the component gives the model room to reason before it commits to a time range.

```python hl_lines="2"
class TimeRange(BaseModel):
    chain_of_thought: str = Field(..., description="Step by step reasoning to get the correct time range")
    start_time: int = Field(..., description="The start time in hours.")
    end_time: int = Field(..., description="The end time in hours.")
```
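
A small sketch of the reused component in action. It assumes hours on a 24-hour clock and ranges that may wrap past midnight; `duration_hours` is a hypothetical helper, and the reasoning strings are invented stand-ins for model output.

```python
from pydantic import BaseModel, Field


class TimeRange(BaseModel):
    chain_of_thought: str = Field(
        ..., description="Step by step reasoning to get the correct time range"
    )
    start_time: int = Field(..., description="The start time in hours.")
    end_time: int = Field(..., description="The end time in hours.")


class UserDetail(BaseModel):
    id: int = Field(..., description="Unique identifier for each user.")
    age: int
    name: str
    work_time: TimeRange = Field(..., description="Time range during which the user is working.")
    leisure_time: TimeRange = Field(..., description="Time range reserved for leisure activities.")


def duration_hours(time_range: TimeRange) -> int:
    """Length of the range in hours, assuming a 24-hour clock with wraparound."""
    return (time_range.end_time - time_range.start_time) % 24


user = UserDetail(
    id=1,
    age=30,
    name="Ada",
    work_time={
        "chain_of_thought": "Works nine to five, so 9 until 17.",
        "start_time": 9,
        "end_time": 17,
    },
    leisure_time={
        "chain_of_thought": "Reads from ten at night until one in the morning.",
        "start_time": 22,
        "end_time": 1,
    },
)
print(duration_hours(user.work_time), duration_hours(user.leisure_time))
```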