diff --git a/docs/examples/search.md b/docs/examples/search.md index da97e3e..45b6e18 100644 --- a/docs/examples/search.md +++ b/docs/examples/search.md @@ -6,79 +6,34 @@ In this example, we will demonstrate how to leverage the `MultiTask` and `enum.E Extracting a list of tasks from text is a common use case for leveraging language models. This pattern can be applied to various applications, such as virtual assistants like Siri or Alexa, where understanding user intent and breaking down requests into actionable tasks is crucial. In this example, we will demonstrate how to use OpenAI Function Call to segment search queries and execute them in parallel. -## Defining the Structures +## Structure of the Data -Let's model the problem as breaking down a search request into a list of search queries. We will use an enum to represent different types of searches and take advantage of Python objects to add additional query logic. +The `Search` class is a Pydantic model that defines the structure of the search query. It has three fields: `title`, `query`, and `type`. The `title` field is the title of the request, the `query` field is the query to search for relevant content, and the `type` field is the type of search. The `execute` method is used to execute the search query. -```python -import enum -from pydantic import Field - - -class SearchType(str, enum.Enum): - """Enumeration representing the types of searches that can be performed.""" - - VIDEO = "video" - EMAIL = "email" - - -class Search(BaseModel): - """ - Class representing a single search query. - """ - - title: str = Field(..., description="Title of the request") - query: str = Field(..., description="Query to search for relevant content") - type: SearchType = Field(..., description="Type of search") - - async def execute(self): - print( - f"Searching for `{self.title}` with query `{self.query}` using `{self.type}`" - ) -``` - -Notice that we have added the `execute` method to the `Search` class. This method can be used to route the search query based on the enum type. You can add logic specific to each search type in the `execute` method. - -Next, let's define a class to represent multiple search queries. - -```python -from typing import List - - -class MultiSearch(BaseModel): - "Correctly segmented set of search results" - tasks: List[Search] -``` - -The `MultiSearch` class has a single attribute, `tasks`, which is a list of `Search` objects. - -This pattern is so common that we've added a helper function `MultiTask` to makes this simpler - -```python -from instructor.dsl import MultiTask - -MultiSearch = MultiTask(Search) -``` - -## Calling Completions - -To segment a search query, we will use the base openai api. We can define a function that takes a string and returns segmented search queries using the `MultiSearch` class. - -```python hl_lines="7 8" +```python hl_line import instructor from openai import OpenAI +from typing import Iterable +from pydantic import BaseModel, Field # Apply the patch to the OpenAI client # enables response_model keyword client = instructor.patch(OpenAI()) +class Search(BaseModel): + query: str = Field(..., description="Query to search for relevant content") + type: Literal["web", "image", "video"] = Field(..., description="Type of search") + + async def execute(self): + print( + f"Searching for `{self.title}` with query `{self.query}` using `{self.type}`" + ) + def segment(data: str) -> MultiSearch: return client.chat.completions.create( model="gpt-3.5-turbo-0613", - temperature=0.1, - functions=[MultiSearch.openai_schema], - function_call={"name": MultiSearch.openai_schema["name"]}, + response_model=Iterable[Search], messages=[ { "role": "user", @@ -87,36 +42,18 @@ def segment(data: str) -> MultiSearch: ], max_tokens=1000, ) -``` - -The `segment` function takes a string `data` and creates a completion. It prompts the model to segment the data into multiple search queries and returns the result as a `MultiSearch` object. - -## Evaluating an Example - -Let's evaluate an example by segmenting a search query and executing the segmented queries. - -```python -import asyncio - -queries = segment( - "Please send me the video from last week about the investment case study and also documents about your GDPR policy?" -) - - -async def execute_queries(queries: Multisearch): - await asyncio.gather(*[q.execute() for q in queries.tasks]) - - -loop = asyncio.get_event_loop() -loop.run_until_complete(execute_queries()) -loop.close() -``` - -In this example, we use the `segment` function to segment the search query. We then use `asyncio` to asynchronously execute the queries using the `execute` method defined in the `Search` class. - -The output will be: - -``` -Searching for `Please send me the video from last week about the investment case study` with query `Please send me the video from last week about the investment case study` using `SearchType.VIDEO` -Searching for `also documents about your GDPR policy?` with query `also documents about your GDPR policy?` using `SearchType.EMAIL` + +for search in segment("Search for a picture of a cat and a video of a dog"): + print(search.model_dump_json()) + """ + { + "query": "a picture of a cat", + "type": "image" + } + { + "query": "a video of a dog", + "type": "video" + } + """ + } ```