Opensource examples - Runpod w/Text-Generation-WebUI API Endpoint (#247)

Co-authored-by: Jason Liu <jason@jxnl.co>
2026-06-05 22:50:18 +00:00 · 2023-12-04 21:28:50 -07:00
parent 8d3c255240
commit cb96010bba
5 changed files with 90 additions and 2 deletions
@@ -0,0 +1,12 @@
+# Instructor with open source models
+Instructor works with open source model providers that support the [OpenAI API chat endpoint](https://platform.openai.com/docs/api-reference/chat).
+
+See the examples README [here](https://github.com/jxnl/instructor/tree/main/examples/open_source_examples).
+
+# Currently tested open source model providers
+- [OpenRouter](https://openrouter.ai/)
+- [Perplexity](https://www.perplexity.ai/)
+- [RunPod TheBloke LLMs](https://github.com/TheBlokeAI/dockerLLM/blob/main/README_Runpod_LocalLLMsUI.md) **
+
+
+** This uses text-generation-webui with the OpenAI plugin under the hood.
@@ -11,4 +11,17 @@
 1. Sign up for an Openrouter Account - https://www.perplexity.ai/
 2. Create an API key - https://www.perplexity.ai/pplx-api
 3. Add API key to environment - `export PERPLEXITY_API_KEY=your key here`
-4. Add Openrouter API endpoint to environment - `export PERPLEXITY_BASE_URL=https://api.perplexity.ai` [See https://docs.perplexity.ai/reference/post_chat_completions for potential updates]
+4. Add Openrouter API endpoint to environment - `export PERPLEXITY_BASE_URL=https://api.perplexity.ai` [See https://docs.perplexity.ai/reference/post_chat_completions for potential updates]
+
+## Runpod
+1. Sign up for a Runpod account - https://www.runpod.io/console/signup
+2. Add credits; there is no free tier. - https://www.runpod.io/console/user/billing
+3. Navigate to the Templates page [left selection menu]; under `Official`, click `Deploy` on the `RunPod TheBloke LLMs` template. - https://www.runpod.io/console/templates
+4. Navigate to the Community Cloud page [left selection menu] and click `Deploy` on a GPU with >=16 GB; 1x RTX 4000 Ada SFF works. - https://www.runpod.io/console/gpu-cloud
+5. Click `Customize Deployment`, open the `Environment Variables` drop-down, enter the following key/value pairs, then click `Set Overrides`, then `Continue`, and finally `Deploy`.
+    - key=MODEL value=TheBloke/OpenHermes-2.5-Mistral-7B-GPTQ
+    - key=UI_ARGS value=--n-gpu-layers 100 --threads 1
+6. Navigate to Pods [left selection menu] and wait until the `Connect` button appears on the Pod you just deployed, then click it. Right-click `HTTP Service[Port 5000]` and copy the link address. - https://www.runpod.io/console/pods
+    - Add the Runpod API endpoint to your environment - `export RUNPOD_BASE_URL=your-runpod-link/v1` <-- Make sure to append `/v1` to the copied link.
+    - Add the Runpod API key to your environment - `export RUNPOD_API_KEY="None"` <-- This should be the literal string `None`.
+7. When you are done, stop the instance by clicking the stop icon on the Pods page. - https://www.runpod.io/console/pods
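Step 6 is easy to get wrong because the copied link does not include the `/v1` suffix. The following is a minimal sketch, not part of this commit, of how a script might tolerate that. The helper names `normalize_base_url` and `load_runpod_settings` are hypothetical, the URL in the usage note is a placeholder, and falling back to the string `"None"` for the key is an assumption based on the step above.

```python
import os


def normalize_base_url(url: str) -> str:
    """Return the URL with exactly one trailing '/v1' path segment."""
    trimmed = url.strip().rstrip("/")
    if trimmed.endswith("/v1"):
        return trimmed
    return trimmed + "/v1"


def load_runpod_settings(env=None):
    """Read RUNPOD_BASE_URL and RUNPOD_API_KEY from a mapping (default: os.environ).

    Hypothetical helper: returns a (base_url, api_key) tuple.
    """
    env = os.environ if env is None else env
    raw_url = env.get("RUNPOD_BASE_URL")
    if not raw_url:
        raise RuntimeError("RUNPOD_BASE_URL is not set in environment variables")
    # Assumption: the endpoint does not check the key, so fall back to the
    # placeholder string "None" that the setup steps tell you to export.
    api_key = env.get("RUNPOD_API_KEY") or "None"
    return normalize_base_url(raw_url), api_key
```

For example, `load_runpod_settings({"RUNPOD_BASE_URL": "https://your-runpod-link.example/"})` would yield a base URL ending in `/v1` and the placeholder key, which could then be passed to `OpenAI(api_key=..., base_url=...)` as in the example script.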
@@ -9,7 +9,7 @@ from instructor import Maybe, Mode
 openrouter_api_key = os.environ.get("OPENROUTER_API_KEY")
 assert openrouter_api_key, "OPENROUTER_API_KEY is not set in environment variables"

-# Base URL for OpenAI
+# Base URL for OpenAI client
 openrouter_base_url = os.environ.get("OPENROUTER_BASE_URL")
 assert openrouter_base_url, "OPENROUTER_BASE_URL is not set in environment variables"

@@ -0,0 +1,62 @@
+import os
+import instructor
+from openai import OpenAI
+from pydantic import BaseModel, Field
+from typing import Optional
+from instructor import Mode
+
+# Extract API key from environment
+runpod_api_key = os.environ.get("RUNPOD_API_KEY")
+assert runpod_api_key, "RUNPOD_API_KEY is not set in environment variables"
+
+# Base URL for OpenAI client
+runpod_base_url = os.environ.get("RUNPOD_BASE_URL")
+assert runpod_base_url, "RUNPOD_BASE_URL is not set in environment variables"
+
+# Initialize OpenAI client
+client = instructor.patch(
+    OpenAI(api_key=runpod_api_key, base_url=runpod_base_url),
+    mode=Mode.JSON,
+)
+
+
+data = [
+    "Brandon is 33 years old. He works as a solution architect.",
+    "Jason is 25 years old. He is the GOAT.",
+    "Dominic is 45 years old. He is retired.",
+    "Jenny is 72. She is a wife and a CEO.",
+    "Holly is 22. She is an explorer.",
+    "There once was a prince, named Benny. He ruled for 10 years, which just ended. He started at 22.",
+    "Simon says, why are you 22 years old marvin?",
+]
+
+
+if __name__ == "__main__":
+
+    class UserDetail(BaseModel):
+        name: str = Field(description="Name extracted from the text")
+        age: int = Field(description="Age extracted from the text")
+        occupation: Optional[str] = Field(
+            default=None, description="Occupation extracted from the text"
+        )
+
+    for content in data:
+        try:
+            user = client.chat.completions.create(
+                response_model=UserDetail,
+                model="TheBloke_OpenHermes-2.5-Mistral-7B-GPTQ",
+                messages=[
+                    {
+                        "role": "system",
+                        "content": "You are an expert at outputting json. You output valid JSON.",
+                    },
+                    {
+                        "role": "user",
+                        "content": f"Extract the user details from the following text: {content}. Match your response to the following schema: {UserDetail.model_json_schema()}",
+                    },
+                ],
+            )
+            print(f"Result: {user}")
+        except Exception as e:
+            print(f"Error: {e}")
+            continue
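The script above relies on `Mode.JSON` and the `except` branch to cope with replies that do not match `UserDetail`. To make that failure mode concrete, here is a standard-library sketch of the kind of check involved. It is an illustration only and not how instructor parses responses internally; `UserDetailSketch` and `parse_user_detail` are hypothetical names, and the dataclass merely stands in for the Pydantic model.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class UserDetailSketch:
    """Stand-in for the UserDetail Pydantic model (illustration only)."""
    name: str
    age: int
    occupation: Optional[str] = None


def parse_user_detail(reply: str) -> UserDetailSketch:
    """Extract the outermost JSON object from a model reply and check its fields."""
    start = reply.find("{")
    end = reply.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in reply")
    # json.JSONDecodeError is a subclass of ValueError, so malformed JSON
    # surfaces as the same exception type as the checks below.
    payload = json.loads(reply[start : end + 1])
    name = payload.get("name")
    age = payload.get("age")
    occupation = payload.get("occupation")
    if not isinstance(name, str):
        raise ValueError("name must be a string")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(age, bool) or not isinstance(age, int):
        raise ValueError("age must be an integer")
    if occupation is not None and not isinstance(occupation, str):
        raise ValueError("occupation must be a string or null")
    return UserDetailSketch(name=name, age=age, occupation=occupation)
```

A reply such as `Sure! {"name": "Holly", "age": 22}` parses with `occupation` left as `None`, while a reply with no JSON object, or with `age` given as a string, raises `ValueError`, which corresponds to the `Error:` branch in the loop.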
@@ -156,6 +156,7 @@ nav:
    - Action Item and Dependency Mapping: 'examples/action_items.md'
    - Multi-File Code Generation: 'examples/gpt-engineer.md'
    - PII Data Sanitization: 'examples/pii.md'
+    - Open Source: 'examples/open_source.md'
  - CLI Reference:
      - "Introduction": "cli/index.md"
      - "Finetuning GPT-3.5": "cli/finetune.md"