Opensource examples - Runpod w/Text-Generation-WebUI API Endpoint (#247)

Co-authored-by: Jason Liu <jason@jxnl.co>
Brandon Phillips
2023-12-04 21:28:50 -07:00
committed by GitHub
parent 8d3c255240
commit cb96010bba
5 changed files with 90 additions and 2 deletions
+12
@@ -0,0 +1,12 @@
# Instructor with open source models
Instructor works with open source model providers that support the [OpenAI API chat endpoint](https://platform.openai.com/docs/api-reference/chat).
See the examples README [here](https://github.com/jxnl/instructor/tree/main/examples/open_source_examples).
# Currently tested open source model providers
- [OpenRouter](https://openrouter.ai/)
- [Perplexity](https://www.perplexity.ai/)
- [RunPod TheBloke LLMs](https://github.com/TheBlokeAI/dockerLLM/blob/main/README_Runpod_LocalLLMsUI.md) **
** This uses text-generation-webui with the OpenAI plugin under the hood.
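As a rough illustration of what "supports the OpenAI API chat endpoint" means in practice, the sketch below builds (but does not send) the kind of HTTP request an OpenAI-style client issues. The helper name `build_chat_request`, the example host, and the placeholder key and model are assumptions made for this illustration; they are not part of Instructor or of any provider's SDK.

```python
import json
import urllib.request


def build_chat_request(
    base_url: str, api_key: str, model: str, prompt: str
) -> urllib.request.Request:
    """Build a POST request for an OpenAI-style chat completions endpoint.

    Hypothetical helper for illustration only; nothing is sent over the network.
    """
    # OpenAI-compatible providers expose the chat route under the base URL.
    url = base_url.rstrip("/") + "/chat/completions"
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder values; substitute a real provider URL, key, and model name.
    req = build_chat_request(
        "https://api.example.com/v1", "placeholder-key", "placeholder-model", "Hello"
    )
    print(req.get_method(), req.full_url)
```

A provider is a plausible candidate for the list above if it accepts a request of this shape at its own base URL.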
+14 -1
@@ -11,4 +11,17 @@
1. Sign up for a Perplexity account - https://www.perplexity.ai/
2. Create an API key - https://www.perplexity.ai/pplx-api
3. Add API key to environment - `export PERPLEXITY_API_KEY=your key here`
4. Add the Perplexity API endpoint to environment - `export PERPLEXITY_BASE_URL=https://api.perplexity.ai` [See https://docs.perplexity.ai/reference/post_chat_completions for potential updates]
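To confirm the two environment variables from the steps above are visible to Python before running an example, a small check along these lines may help. `load_provider_settings` is a hypothetical helper written for this sketch, not a function provided by Instructor; it assumes the `<PREFIX>_API_KEY` / `<PREFIX>_BASE_URL` naming used in this README.

```python
import os
from typing import Mapping, Optional, Tuple


def load_provider_settings(
    prefix: str, env: Optional[Mapping[str, str]] = None
) -> Tuple[str, str]:
    """Return (api_key, base_url) for a provider prefix such as "PERPLEXITY".

    Hypothetical helper; raises RuntimeError naming every missing variable.
    """
    env = os.environ if env is None else env
    key_name = f"{prefix}_API_KEY"
    url_name = f"{prefix}_BASE_URL"
    api_key = env.get(key_name)
    base_url = env.get(url_name)
    # Treat unset and empty values the same way: both count as missing.
    missing = [
        name for name, value in ((key_name, api_key), (url_name, base_url)) if not value
    ]
    if missing:
        raise RuntimeError(f"Not set in environment variables: {', '.join(missing)}")
    return api_key, base_url


if __name__ == "__main__":
    try:
        _, url = load_provider_settings("PERPLEXITY")
        print(f"Perplexity endpoint: {url}")
    except RuntimeError as err:
        print(err)
```

The same helper would work for the other providers by passing a different prefix, since they follow the same variable naming.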
## Runpod
1. Sign up for a Runpod account - https://www.runpod.io/console/signup
2. Add credits; there is no free tier. - https://www.runpod.io/console/user/billing
3. Navigate to the Templates page [left selection menu]; under `Official`, click deploy on the `RunPod TheBloke LLMs` template. - https://www.runpod.io/console/templates
4. Navigate to the Community Cloud page [left selection menu] and click `Deploy` on a GPU with >=16 GB; 1x RTX 4000 Ada SFF works. - https://www.runpod.io/console/gpu-cloud
5. Click `Customize Deployment`, open the `Environment Variables` drop-down, enter the following key/value pairs, then click `Set Overrides`, then `Continue`, and finally `Deploy`.
    - key=MODEL value=TheBloke/OpenHermes-2.5-Mistral-7B-GPTQ
    - key=UI_ARGS value=--n-gpu-layers 100 --threads 1
6. Navigate to Pods [left selection menu] and wait until the `Connect` button appears on the Pod you just deployed, then click it. Right-click `HTTP Service[Port 5000]` and copy the link address. - https://www.runpod.io/console/pods
    - Add the Runpod API endpoint to environment - `export RUNPOD_BASE_URL=your-runpod-link/v1` <-- Make sure to append `/v1` as well
    - Add the Runpod API key to environment - `export RUNPOD_API_KEY="None"` <-- This should be the string "None".
7. When you are done, stop the instance by clicking the stop icon on the Pods page. - https://www.runpod.io/console/pods
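Forgetting the `/v1` suffix in step 6 is an easy mistake. The sketch below shows one way to normalize the copied link before exporting it as `RUNPOD_BASE_URL`. `with_v1_suffix` is a hypothetical helper and `https://my-pod.example.com` is a made-up placeholder, not a real Runpod address.

```python
from urllib.parse import urlsplit, urlunsplit


def with_v1_suffix(link: str) -> str:
    """Return the link with exactly one trailing /v1 path segment.

    Hypothetical helper for illustration; any query string or fragment
    on the copied link is dropped.
    """
    parts = urlsplit(link.strip())
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"Expected a full http(s) URL, got: {link!r}")
    # Remove trailing slashes, then append /v1 only if it is not already there.
    path = parts.path.rstrip("/")
    if not path.endswith("/v1"):
        path += "/v1"
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


if __name__ == "__main__":
    # Placeholder link; paste the address copied from the Pod's Connect dialog.
    print(f"export RUNPOD_BASE_URL={with_v1_suffix('https://my-pod.example.com/')}")
```

The printed line can then be pasted into the shell in place of the manual `export` from step 6.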
+1 -1
@@ -9,7 +9,7 @@ from instructor import Maybe, Mode
openrouter_api_key = os.environ.get("OPENROUTER_API_KEY")
assert openrouter_api_key, "OPENROUTER_API_KEY is not set in environment variables"
# Base URL for OpenAI client
openrouter_base_url = os.environ.get("OPENROUTER_BASE_URL")
assert openrouter_base_url, "OPENROUTER_BASE_URL is not set in environment variables"
+62
@@ -0,0 +1,62 @@
import os
import instructor
from openai import OpenAI
from pydantic import BaseModel, Field
from typing import Optional
from instructor import Mode
# Extract API key from environment
runpod_api_key = os.environ.get("RUNPOD_API_KEY")
assert runpod_api_key, "RUNPOD_API_KEY is not set in environment variables"
# Base URL for OpenAI client
runpod_base_url = os.environ.get("RUNPOD_BASE_URL")
assert runpod_base_url, "RUNPOD_BASE_URL is not set in environment variables"
# Initialize OpenAI client
client = instructor.patch(
    OpenAI(api_key=runpod_api_key, base_url=runpod_base_url),
    mode=Mode.JSON,
)
data = [
    "Brandon is 33 years old. He works as a solution architect.",
    "Jason is 25 years old. He is the GOAT.",
    "Dominic is 45 years old. He is retired.",
    "Jenny is 72. She is a wife and a CEO.",
    "Holly is 22. She is an explorer.",
    "There onces was a prince, named Benny. He ruled for 10 years, which just ended. He started at 22.",
    "Simon says, why are you 22 years old marvin?",
]
if __name__ == "__main__":

    class UserDetail(BaseModel):
        name: str = Field(description="Name extracted from the text")
        age: int = Field(description="Age extracted from the text")
        occupation: Optional[str] = Field(
            default=None, description="Occupation extracted from the text"
        )

    for content in data:
        try:
            user = client.chat.completions.create(
                response_model=UserDetail,
                model="TheBloke_OpenHermes-2.5-Mistral-7B-GPTQ",
                messages=[
                    {
                        "role": "system",
                        "content": "You are an expert at outputting json. You output valid JSON.",
                    },
                    {
                        "role": "user",
                        "content": f"Extract the user details from the following text: {content}. Match your response to the following schema: {UserDetail.model_json_schema()}",
                    },
                ],
            )
            print(f"Result: {user}")
        except Exception as e:
            print(f"Error: {e}")
            continue
+1
@@ -156,6 +156,7 @@ nav:
      - Action Item and Dependency Mapping: 'examples/action_items.md'
      - Multi-File Code Generation: 'examples/gpt-engineer.md'
      - PII Data Sanitization: 'examples/pii.md'
      - Open Source: 'examples/open_source.md'
  - CLI Reference:
      - "Introduction": "cli/index.md"
      - "Finetuning GPT-3.5": "cli/finetune.md"