mv module integrations docs (#8101)

Bagatur
2023-07-23 23:23:16 -07:00
committed by GitHub
parent 8ea840432f
commit c8c8635dc9
619 changed files with 2322 additions and 449 deletions
@@ -0,0 +1,16 @@
# AI21 Labs
This page covers how to use the AI21 ecosystem within LangChain.
It is broken into two parts: installation and setup, and then references to specific AI21 wrappers.
## Installation and Setup
- Get an AI21 API key and set it as an environment variable (`AI21_API_KEY`)
## Wrappers
### LLM
There exists an AI21 LLM wrapper, which you can access with:
```python
from langchain.llms import AI21
```
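A minimal sketch of the setup step, assuming the key is supplied through the environment. The placeholder value and the helper name `make_ai21_llm` are illustrative and not part of LangChain; the import is deferred so the snippet loads even where LangChain is not installed.

```python
import os

# Placeholder for illustration only; export your real key in the shell instead.
os.environ.setdefault("AI21_API_KEY", "your-ai21-api-key")


def make_ai21_llm():
    """Build the AI21 wrapper, which picks up AI21_API_KEY from the environment."""
    from langchain.llms import AI21  # deferred import (requires langchain)

    return AI21()
```

Calling `make_ai21_llm()` should return an LLM object that can be called with a prompt string, like the other LangChain LLM wrappers.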
@@ -0,0 +1,311 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Aim\n",
"\n",
"Aim makes it super easy to visualize and debug LangChain executions. Aim tracks inputs and outputs of LLMs and tools, as well as actions of agents. \n",
"\n",
"With Aim, you can easily debug and examine an individual execution:\n",
"\n",
"![](https://user-images.githubusercontent.com/13848158/227784778-06b806c7-74a1-4d15-ab85-9ece09b458aa.png)\n",
"\n",
"Additionally, you have the option to compare multiple executions side by side:\n",
"\n",
"![](https://user-images.githubusercontent.com/13848158/227784994-699b24b7-e69b-48f9-9ffa-e6a6142fd719.png)\n",
"\n",
"Aim is fully open source; [learn more](https://github.com/aimhubio/aim) about Aim on GitHub.\n",
"\n",
"Let's move forward and see how to enable and configure the Aim callback."
],
"id": "613b5312"
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<h3>Tracking LangChain Executions with Aim</h3>"
],
"id": "3615f1e2"
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In this notebook we will explore three usage scenarios. To start off, we will install the necessary packages and import the required modules. We will then set two environment variables, either within the Python script or through the terminal."
],
"id": "5d271566"
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "mf88kuCJhbVu"
},
"outputs": [],
"source": [
"!pip install aim\n",
"!pip install langchain\n",
"!pip install openai\n",
"!pip install google-search-results"
],
"id": "d16e00da"
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "g4eTuajwfl6L"
},
"outputs": [],
"source": [
"import os\n",
"from datetime import datetime\n",
"\n",
"from langchain.llms import OpenAI\n",
"from langchain.callbacks import AimCallbackHandler, StdOutCallbackHandler"
],
"id": "c970cda9"
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Our examples use a GPT model as the LLM, and OpenAI offers an API for this purpose. You can obtain the key from the following link: https://platform.openai.com/account/api-keys .\n",
"\n",
"We will use the SerpApi to retrieve search results from Google. To acquire the SerpApi key, please go to https://serpapi.com/manage-api-key ."
],
"id": "426ecf0d"
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "T1bSmKd6V2If"
},
"outputs": [],
"source": [
"os.environ[\"OPENAI_API_KEY\"] = \"...\"\n",
"os.environ[\"SERPAPI_API_KEY\"] = \"...\""
],
"id": "b2b1cfc2"
},
{
"cell_type": "markdown",
"metadata": {
"id": "QenUYuBZjIzc"
},
"source": [
"The event methods of `AimCallbackHandler` accept the LangChain module or agent as input and log at least the prompts and generated results, as well as the serialized version of the LangChain module, to the designated Aim run."
],
"id": "53070869"
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "KAz8weWuUeXF"
},
"outputs": [],
"source": [
"session_group = datetime.now().strftime(\"%m.%d.%Y_%H.%M.%S\")\n",
"aim_callback = AimCallbackHandler(\n",
" repo=\".\",\n",
" experiment_name=\"scenario 1: OpenAI LLM\",\n",
")\n",
"\n",
"callbacks = [StdOutCallbackHandler(), aim_callback]\n",
"llm = OpenAI(temperature=0, callbacks=callbacks)"
],
"id": "3a30e90d"
},
{
"cell_type": "markdown",
"metadata": {
"id": "b8WfByB4fl6N"
},
"source": [
"The `flush_tracker` function is used to record LangChain assets on Aim. By default, the session is reset rather than being terminated outright."
],
"id": "1f591582"
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<h3>Scenario 1</h3> In the first scenario, we will use OpenAI LLM."
],
"id": "8a425743"
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "o_VmneyIUyx8"
},
"outputs": [],
"source": [
"# scenario 1 - LLM\n",
"llm_result = llm.generate([\"Tell me a joke\", \"Tell me a poem\"] * 3)\n",
"aim_callback.flush_tracker(\n",
" langchain_asset=llm,\n",
" experiment_name=\"scenario 2: Chain with multiple SubChains on multiple generations\",\n",
")"
],
"id": "795cda48"
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<h3>Scenario 2</h3> Scenario two involves chaining with multiple SubChains across multiple generations."
],
"id": "7374776f"
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "trxslyb1U28Y"
},
"outputs": [],
"source": [
"from langchain.prompts import PromptTemplate\n",
"from langchain.chains import LLMChain"
],
"id": "f946249a"
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "uauQk10SUzF6"
},
"outputs": [],
"source": [
"# scenario 2 - Chain\n",
"template = \"\"\"You are a playwright. Given the title of play, it is your job to write a synopsis for that title.\n",
"Title: {title}\n",
"Playwright: This is a synopsis for the above play:\"\"\"\n",
"prompt_template = PromptTemplate(input_variables=[\"title\"], template=template)\n",
"synopsis_chain = LLMChain(llm=llm, prompt=prompt_template, callbacks=callbacks)\n",
"\n",
"test_prompts = [\n",
" {\n",
" \"title\": \"documentary about good video games that push the boundary of game design\"\n",
" },\n",
" {\"title\": \"the phenomenon behind the remarkable speed of cheetahs\"},\n",
" {\"title\": \"the best in class mlops tooling\"},\n",
"]\n",
"synopsis_chain.apply(test_prompts)\n",
"aim_callback.flush_tracker(\n",
" langchain_asset=synopsis_chain, experiment_name=\"scenario 3: Agent with Tools\"\n",
")"
],
"id": "1012e817"
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<h3>Scenario 3</h3> The third scenario involves an agent with tools."
],
"id": "f18e2d10"
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "_jN73xcPVEpI"
},
"outputs": [],
"source": [
"from langchain.agents import initialize_agent, load_tools\n",
"from langchain.agents import AgentType"
],
"id": "9de08db4"
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "Gpq4rk6VT9cu",
"outputId": "68ae261e-d0a2-4229-83c4-762562263b66"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m I need to find out who Leo DiCaprio's girlfriend is and then calculate her age raised to the 0.43 power.\n",
"Action: Search\n",
"Action Input: \"Leo DiCaprio girlfriend\"\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mLeonardo DiCaprio seemed to prove a long-held theory about his love life right after splitting from girlfriend Camila Morrone just months ...\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I need to find out Camila Morrone's age\n",
"Action: Search\n",
"Action Input: \"Camila Morrone age\"\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3m25 years\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I need to calculate 25 raised to the 0.43 power\n",
"Action: Calculator\n",
"Action Input: 25^0.43\u001b[0m\n",
"Observation: \u001b[33;1m\u001b[1;3mAnswer: 3.991298452658078\n",
"\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer\n",
"Final Answer: Camila Morrone is Leo DiCaprio's girlfriend and her current age raised to the 0.43 power is 3.991298452658078.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
}
],
"source": [
"# scenario 3 - Agent with Tools\n",
"tools = load_tools([\"serpapi\", \"llm-math\"], llm=llm, callbacks=callbacks)\n",
"agent = initialize_agent(\n",
" tools,\n",
" llm,\n",
" agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,\n",
" callbacks=callbacks,\n",
")\n",
"agent.run(\n",
" \"Who is Leo DiCaprio's girlfriend? What is her current age raised to the 0.43 power?\"\n",
")\n",
"aim_callback.flush_tracker(langchain_asset=agent, reset=False, finish=True)"
],
"id": "0992df94"
}
],
"metadata": {
"accelerator": "GPU",
"colab": {
"provenance": []
},
"gpuClass": "standard",
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
@@ -0,0 +1,29 @@
# Airbyte
>[Airbyte](https://github.com/airbytehq/airbyte) is a data integration platform for ELT pipelines from APIs,
> databases & files to warehouses & lakes. It has the largest catalog of ELT connectors to data warehouses and databases.
## Installation and Setup
These instructions show how to load any source from `Airbyte` into a local `JSON` file that can be read in as a document.
**Prerequisites:**
Have `Docker Desktop` installed.
**Steps:**
1. Clone Airbyte from GitHub - `git clone https://github.com/airbytehq/airbyte.git`.
2. Switch into Airbyte directory - `cd airbyte`.
3. Start Airbyte - `docker compose up`.
4. In your browser, visit http://localhost:8000. You will be asked for a username and password. By default, the username is `airbyte` and the password is `password`.
5. Set up any source you wish.
6. Set the destination to Local JSON with a destination path, for example `/json_data`. Set up a manual sync.
7. Run the connection.
8. To see what files are created, navigate to: `file:///tmp/airbyte_local/`.
## Document Loader
See a [usage example](/docs/modules/data_connection/document_loaders/integrations/airbyte_json.html).
```python
from langchain.document_loaders import AirbyteJSONLoader
```
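A hedged sketch of what comes after step 8: locate the files Airbyte wrote and pass one to the loader. The default directory combines `/tmp/airbyte_local/` with the `/json_data` destination path from the steps above; the `.jsonl` extension and both helper names are assumptions for illustration.

```python
from pathlib import Path


def find_airbyte_files(root="/tmp/airbyte_local/json_data"):
    """Return the .jsonl files found directly under root, sorted by name."""
    return sorted(Path(root).glob("*.jsonl"))


def load_airbyte_documents(path):
    """Read one Airbyte output file into LangChain documents."""
    from langchain.document_loaders import AirbyteJSONLoader  # deferred import

    return AirbyteJSONLoader(str(path)).load()
```

For example, `load_airbyte_documents(find_airbyte_files()[0])` would load the first file found, provided the sync has produced at least one.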
@@ -0,0 +1,28 @@
# Airtable
>[Airtable](https://en.wikipedia.org/wiki/Airtable) is a cloud collaboration service.
> `Airtable` is a spreadsheet-database hybrid, with the features of a database but applied to a spreadsheet.
> The fields in an Airtable table are similar to cells in a spreadsheet, but have types such as 'checkbox',
> 'phone number', and 'drop-down list', and can reference file attachments like images.
>Users can create a database, set up column types, add records, link tables to one another, collaborate, sort records
> and publish views to external websites.
## Installation and Setup
```bash
pip install pyairtable
```
* Get your [API key](https://support.airtable.com/docs/creating-and-using-api-keys-and-access-tokens).
* Get the [ID of your base](https://airtable.com/developers/web/api/introduction).
* Get the [table ID from the table URL](https://www.highviewapps.com/kb/where-can-i-find-the-airtable-base-id-and-table-id/#:~:text=Both%20the%20Airtable%20Base%20ID,URL%20that%20begins%20with%20tbl).
## Document Loader
```python
from langchain.document_loaders import AirtableLoader
```
See an [example](/docs/modules/data_connection/document_loaders/integrations/airtable.html).
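As a rough sketch of how the three values fit together: the base ID (prefixed `app`) and the table ID (prefixed `tbl`) can be read out of a table URL and passed to the loader along with the API key. The helper names and the example URL are hypothetical.

```python
from urllib.parse import urlparse


def ids_from_airtable_url(url):
    """Return (base_id, table_id) from a table URL; a missing part is None."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    base_id = next((p for p in parts if p.startswith("app")), None)
    table_id = next((p for p in parts if p.startswith("tbl")), None)
    return base_id, table_id


def load_airtable(api_key, url):
    """Load every record of the table behind url as LangChain documents."""
    from langchain.document_loaders import AirtableLoader  # deferred import

    base_id, table_id = ids_from_airtable_url(url)
    return AirtableLoader(api_token=api_key, table_id=table_id, base_id=base_id).load()
```

With a made-up URL such as `https://airtable.com/appABC123/tblXYZ789/viwQRS456`, the first helper returns `("appABC123", "tblXYZ789")`.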
@@ -0,0 +1,36 @@
# Aleph Alpha
>[Aleph Alpha](https://docs.aleph-alpha.com/) was founded in 2019 with the mission to research and build the foundational technology for an era of strong AI. The team of international scientists, engineers, and innovators researches, develops, and deploys transformative AI like large language and multimodal models and runs the fastest European commercial AI cluster.
>[The Luminous series](https://docs.aleph-alpha.com/docs/introduction/luminous/) is a family of large language models.
## Installation and Setup
```bash
pip install aleph-alpha-client
```
Create a new token by following these [instructions](https://docs.aleph-alpha.com/docs/account/#create-a-new-token).
```python
from getpass import getpass
ALEPH_ALPHA_API_KEY = getpass()
```
## LLM
See a [usage example](/docs/modules/model_io/models/llms/integrations/aleph_alpha.html).
```python
from langchain.llms import AlephAlpha
```
## Text Embedding Models
See a [usage example](/docs/modules/data_connection/text_embedding/integrations/aleph_alpha.html).
```python
from langchain.embeddings import AlephAlphaSymmetricSemanticEmbedding, AlephAlphaAsymmetricSemanticEmbedding
```
@@ -0,0 +1,28 @@
# Alibaba Cloud OpenSearch
[Alibaba Cloud OpenSearch](https://www.alibabacloud.com/product/opensearch) is a one-stop platform to develop intelligent search services. OpenSearch was built based on the large-scale distributed search engine developed by Alibaba. OpenSearch serves more than 500 business cases in Alibaba Group and thousands of Alibaba Cloud customers. OpenSearch helps develop search services in different search scenarios, including e-commerce, O2O, multimedia, the content industry, communities and forums, and big data query in enterprises.
OpenSearch helps you develop high quality, maintenance-free, and high performance intelligent search services to provide your users with high search efficiency and accuracy.
OpenSearch provides the vector search feature. In specific scenarios, especially test question search and image search scenarios, you can use the vector search feature together with the multimodal search feature to improve the accuracy of search results. This topic describes the syntax and usage notes of vector indexes.
## Purchase an instance and configure it
- Purchase OpenSearch Vector Search Edition from [Alibaba Cloud](https://opensearch.console.aliyun.com) and configure the instance according to the help [documentation](https://help.aliyun.com/document_detail/463198.html?spm=a2c4g.465092.0.0.2cd15002hdwavO).
## Alibaba Cloud OpenSearch Vector Store Wrappers
Supported functions:
- `add_texts`
- `add_documents`
- `from_texts`
- `from_documents`
- `similarity_search`
- `asimilarity_search`
- `similarity_search_by_vector`
- `asimilarity_search_by_vector`
- `similarity_search_with_relevance_scores`
For a more detailed walkthrough of the Alibaba Cloud OpenSearch wrapper, see [this notebook](../modules/indexes/vectorstores/examples/alibabacloud_opensearch.ipynb).
If you encounter any problems during use, please contact [xingshaomin.xsm@alibaba-inc.com](mailto:xingshaomin.xsm@alibaba-inc.com), and we will do our best to provide you with assistance and support.
@@ -0,0 +1,73 @@
# Amazon API Gateway
[Amazon API Gateway](https://aws.amazon.com/api-gateway/) is a fully managed service that makes it easy for developers to create, publish, maintain, monitor, and secure APIs at any scale. APIs act as the "front door" for applications to access data, business logic, or functionality from your backend services. Using API Gateway, you can create RESTful APIs and WebSocket APIs that enable real-time two-way communication applications. API Gateway supports containerized and serverless workloads, as well as web applications.
API Gateway handles all the tasks involved in accepting and processing up to hundreds of thousands of concurrent API calls, including traffic management, CORS support, authorization and access control, throttling, monitoring, and API version management. API Gateway has no minimum fees or startup costs. You pay for the API calls you receive and the amount of data transferred out and, with the API Gateway tiered pricing model, you can reduce your cost as your API usage scales.
## LLM
See a [usage example](/docs/modules/model_io/models/llms/integrations/amazon_api_gateway_example.html).
```python
from langchain.llms import AmazonAPIGateway
api_url = "https://<api_gateway_id>.execute-api.<region>.amazonaws.com/LATEST/HF"
llm = AmazonAPIGateway(api_url=api_url)
# These are sample parameters for Falcon 40B Instruct Deployed from Amazon SageMaker JumpStart
parameters = {
"max_new_tokens": 100,
"num_return_sequences": 1,
"top_k": 50,
"top_p": 0.95,
"do_sample": False,
"return_full_text": True,
"temperature": 0.2,
}
prompt = "what day comes after Friday?"
llm.model_kwargs = parameters
llm(prompt)
# Sample output: 'what day comes after Friday?\nSaturday'
```
## Agent
```python
from langchain.agents import load_tools
from langchain.agents import initialize_agent
from langchain.agents import AgentType
from langchain.llms import AmazonAPIGateway
api_url = "https://<api_gateway_id>.execute-api.<region>.amazonaws.com/LATEST/HF"
llm = AmazonAPIGateway(api_url=api_url)
parameters = {
"max_new_tokens": 50,
"num_return_sequences": 1,
"top_k": 250,
"top_p": 0.25,
"do_sample": False,
"temperature": 0.1,
}
llm.model_kwargs = parameters
# Next, let's load some tools to use. Note that the `llm-math` tool uses an LLM, so we need to pass that in.
tools = load_tools(["python_repl", "llm-math"], llm=llm)
# Finally, let's initialize an agent with the tools, the language model, and the type of agent we want to use.
agent = initialize_agent(
tools,
llm,
agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
verbose=True,
)
# Now let's test it out!
agent.run("""
Write a Python script that prints "Hello, world!"
""")
# Sample output: 'Hello, world!'
```
@@ -0,0 +1,15 @@
# AnalyticDB
This page covers how to use the AnalyticDB ecosystem within LangChain.
### VectorStore
There exists a wrapper around AnalyticDB, allowing you to use it as a vectorstore,
whether for semantic search or example selection.
To import this vectorstore:
```python
from langchain.vectorstores import AnalyticDB
```
For a more detailed walkthrough of the AnalyticDB wrapper, see [this notebook](/docs/modules/data_connection/vectorstores/integrations/analyticdb.html).
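A sketch of how the vectorstore might be wired up, assuming an AnalyticDB instance reachable over the PostgreSQL protocol. The `PG_*` variable names, their defaults, and both helper names are assumptions for illustration; `docs` and `embeddings` are whatever documents and embedding model you already have.

```python
import os


def analyticdb_params(env=None):
    """Collect connection parameters, falling back to local defaults."""
    env = os.environ if env is None else env
    return {
        "driver": env.get("PG_DRIVER", "psycopg2cffi"),
        "host": env.get("PG_HOST", "localhost"),
        "port": int(env.get("PG_PORT", "5432")),
        "database": env.get("PG_DATABASE", "postgres"),
        "user": env.get("PG_USER", "postgres"),
        "password": env.get("PG_PASSWORD", "postgres"),
    }


def build_vectorstore(docs, embeddings):
    """Index docs in AnalyticDB using the parameters gathered above."""
    from langchain.vectorstores import AnalyticDB  # deferred import

    connection_string = AnalyticDB.connection_string_from_db_params(
        **analyticdb_params()
    )
    return AnalyticDB.from_documents(
        docs, embeddings, connection_string=connection_string
    )
```

The returned store can then be queried with `similarity_search`, as with other LangChain vectorstores.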
@@ -0,0 +1,18 @@
# Annoy
> [Annoy](https://github.com/spotify/annoy) (`Approximate Nearest Neighbors Oh Yeah`) is a C++ library with Python bindings to search for points in space that are close to a given query point. It also creates large read-only file-based data structures that are mmapped into memory so that many processes may share the same data.
## Installation and Setup
```bash
pip install annoy
```
## Vectorstore
See a [usage example](/docs/modules/data_connection/vectorstores/integrations/annoy.html).
```python
from langchain.vectorstores import Annoy
```
@@ -0,0 +1,17 @@
# Anyscale
This page covers how to use the Anyscale ecosystem within LangChain.
It is broken into two parts: installation and setup, and then references to specific Anyscale wrappers.
## Installation and Setup
- Get an Anyscale Service URL, route, and API key, and set them as environment variables (`ANYSCALE_SERVICE_URL`, `ANYSCALE_SERVICE_ROUTE`, `ANYSCALE_SERVICE_TOKEN`).
- Please see [the Anyscale docs](https://docs.anyscale.com/productionize/services-v2/get-started) for more details.
## Wrappers
### LLM
There exists an Anyscale LLM wrapper, which you can access with:
```python
from langchain.llms import Anyscale
```
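Since the wrapper depends on three environment variables, a small pre-flight check can make a missing one easier to spot. This is only a sketch; the helper names are hypothetical, and the variable names are the ones listed under Installation and Setup.

```python
import os

REQUIRED_VARS = (
    "ANYSCALE_SERVICE_URL",
    "ANYSCALE_SERVICE_ROUTE",
    "ANYSCALE_SERVICE_TOKEN",
)


def missing_anyscale_vars(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


def make_anyscale_llm():
    """Build the Anyscale wrapper once all three variables are present."""
    missing = missing_anyscale_vars()
    if missing:
        raise RuntimeError("Set these environment variables first: " + ", ".join(missing))
    from langchain.llms import Anyscale  # deferred import

    return Anyscale()
```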
@@ -0,0 +1,46 @@
# Apify
This page covers how to use [Apify](https://apify.com) within LangChain.
## Overview
Apify is a cloud platform for web scraping and data extraction,
which provides an [ecosystem](https://apify.com/store) of more than a thousand
ready-made apps called *Actors* for various scraping, crawling, and extraction use cases.
[![Apify Actors](/img/ApifyActors.png)](https://apify.com/store)
This integration enables you to run Actors on the Apify platform and load their results into LangChain to feed your vector
indexes with documents and data from the web, e.g. to generate answers from websites with documentation,
blogs, or knowledge bases.
## Installation and Setup
- Install the Apify API client for Python with `pip install apify-client`
- Get your [Apify API token](https://console.apify.com/account/integrations) and either set it as
an environment variable (`APIFY_API_TOKEN`) or pass it to the `ApifyWrapper` as `apify_api_token` in the constructor.
## Wrappers
### Utility
You can use the `ApifyWrapper` to run Actors on the Apify platform.
```python
from langchain.utilities import ApifyWrapper
```
For a more detailed walkthrough of this wrapper, see [this notebook](/docs/modules/agents/tools/integrations/apify.html).
### Loader
You can also use our `ApifyDatasetLoader` to get data from an Apify dataset.
```python
from langchain.document_loaders import ApifyDatasetLoader
```
For a more detailed walkthrough of this loader, see [this notebook](/docs/modules/data_connection/document_loaders/integrations/apify_dataset.html).
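To illustrate how the two pieces connect, here is a hedged sketch: `ApifyWrapper.call_actor` runs an Actor and returns a dataset loader, using a mapping function to turn each dataset item into a document. The Actor ID, the `text` and `url` item fields, and the helper names are assumptions chosen for illustration.

```python
def item_to_fields(item):
    """Pick the page text and source URL out of one dataset item."""
    return {
        "page_content": item.get("text") or "",
        "metadata": {"source": item.get("url")},
    }


def crawl_to_loader(start_url):
    """Run a crawler Actor and return a loader over its dataset."""
    from langchain.docstore.document import Document  # deferred imports
    from langchain.utilities import ApifyWrapper

    apify = ApifyWrapper()  # reads APIFY_API_TOKEN from the environment
    return apify.call_actor(
        actor_id="apify/website-content-crawler",
        run_input={"startUrls": [{"url": start_url}]},
        dataset_mapping_function=lambda item: Document(**item_to_fields(item)),
    )
```

Calling `.load()` on the returned loader yields the documents, which can then be fed into a vector index.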
@@ -0,0 +1,29 @@
# Argilla
![Argilla - Open-source data platform for LLMs](https://argilla.io/og.png)
>[Argilla](https://argilla.io/) is an open-source data curation platform for LLMs.
> Using Argilla, everyone can build robust language models through faster data curation
> using both human and machine feedback. We provide support for each step in the MLOps cycle,
> from data labeling to model monitoring.
## Installation and Setup
First, you'll need to install the `argilla` Python package as follows:
```bash
pip install argilla --upgrade
```
If you already have an Argilla Server running, you're good to go. If you don't, refer to
[Argilla - 🚀 Quickstart](https://docs.argilla.io/en/latest/getting_started/quickstart.html#Running-Argilla-Quickstart) to deploy Argilla on HuggingFace Spaces, locally, or on a server.
## Tracking
See a [usage example of `ArgillaCallbackHandler`](/docs/modules/callbacks/integrations/argilla.html).
```python
from langchain.callbacks import ArgillaCallbackHandler
```
@@ -0,0 +1,199 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Arthur"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[Arthur](https://arthur.ai) is a model monitoring and observability platform.\n",
"\n",
"The following guide shows how to run a registered chat LLM with the Arthur callback handler to automatically log model inferences to Arthur.\n",
"\n",
"If you do not have a model currently onboarded to Arthur, visit our [onboarding guide for generative text models](https://docs.arthur.ai/user-guide/walkthroughs/model-onboarding/generative_text_onboarding.html). For more information about how to use the Arthur SDK, visit our [docs](https://docs.arthur.ai/)."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"id": "y8ku6X96sebl"
},
"outputs": [],
"source": [
"from langchain.callbacks import ArthurCallbackHandler\n",
"from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler\n",
"from langchain.chat_models import ChatOpenAI\n",
"from langchain.schema import HumanMessage"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Place Arthur credentials here"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"id": "Me3prhqjsoqz"
},
"outputs": [],
"source": [
"arthur_url = \"https://app.arthur.ai\"\n",
"arthur_login = \"your-arthur-login-username-here\"\n",
"arthur_model_id = \"your-arthur-model-id-here\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Create a LangChain LLM with the Arthur callback handler"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"id": "9Hq9snQasynA"
},
"outputs": [],
"source": [
"def make_langchain_chat_llm():\n",
"    # Stream tokens to stdout and log every inference to Arthur.\n",
"    return ChatOpenAI(\n",
"        streaming=True,\n",
"        temperature=0.1,\n",
"        callbacks=[\n",
"            StreamingStdOutCallbackHandler(),\n",
"            ArthurCallbackHandler.from_credentials(\n",
"                arthur_model_id,\n",
"                arthur_url=arthur_url,\n",
"                arthur_login=arthur_login,\n",
"            ),\n",
"        ],\n",
"    )"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Please enter password for admin: ········\n"
]
}
],
"source": [
"chatgpt = make_langchain_chat_llm()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "aXRyj50Ls8eP"
},
"source": [
"Running the chat LLM with this `run` function will save the chat history in an ongoing list so that the conversation can reference earlier messages and log each response to the Arthur platform. You can view the history of this model's inferences on your [model dashboard page](https://app.arthur.ai/).\n",
"\n",
"Enter `q` to quit the run loop"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"id": "4taWSbN-s31Y"
},
"outputs": [],
"source": [
"def run(llm):\n",
"    history = []\n",
"    while True:\n",
"        user_input = input(\"\\n>>> input >>>\\n>>>: \")\n",
"        if user_input == \"q\":\n",
"            break\n",
"        history.append(HumanMessage(content=user_input))\n",
"        history.append(llm(history))"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {
"id": "MEx8nWJps-EG"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
">>> input >>>\n",
">>>: What is a callback handler?\n",
"A callback handler, also known as a callback function or callback method, is a piece of code that is executed in response to a specific event or condition. It is commonly used in programming languages that support event-driven or asynchronous programming paradigms.\n",
"\n",
"The purpose of a callback handler is to provide a way for developers to define custom behavior that should be executed when a certain event occurs. Instead of waiting for a result or blocking the execution, the program registers a callback function and continues with other tasks. When the event is triggered, the callback function is invoked, allowing the program to respond accordingly.\n",
"\n",
"Callback handlers are commonly used in various scenarios, such as handling user input, responding to network requests, processing asynchronous operations, and implementing event-driven architectures. They provide a flexible and modular way to handle events and decouple different components of a system.\n",
">>> input >>>\n",
">>>: What do I need to do to get the full benefits of this\n",
"To get the full benefits of using a callback handler, you should consider the following:\n",
"\n",
"1. Understand the event or condition: Identify the specific event or condition that you want to respond to with a callback handler. This could be user input, network requests, or any other asynchronous operation.\n",
"\n",
"2. Define the callback function: Create a function that will be executed when the event or condition occurs. This function should contain the desired behavior or actions you want to take in response to the event.\n",
"\n",
"3. Register the callback function: Depending on the programming language or framework you are using, you may need to register or attach the callback function to the appropriate event or condition. This ensures that the callback function is invoked when the event occurs.\n",
"\n",
"4. Handle the callback: Implement the necessary logic within the callback function to handle the event or condition. This could involve updating the user interface, processing data, making further requests, or triggering other actions.\n",
"\n",
"5. Consider error handling: It's important to handle any potential errors or exceptions that may occur within the callback function. This ensures that your program can gracefully handle unexpected situations and prevent crashes or undesired behavior.\n",
"\n",
"6. Maintain code readability and modularity: As your codebase grows, it's crucial to keep your callback handlers organized and maintainable. Consider using design patterns or architectural principles to structure your code in a modular and scalable way.\n",
"\n",
"By following these steps, you can leverage the benefits of callback handlers, such as asynchronous and event-driven programming, improved responsiveness, and modular code design.\n",
">>> input >>>\n",
">>>: q\n"
]
}
],
"source": [
"run(chatgpt)"
]
}
],
"metadata": {
"colab": {
"provenance": []
},
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.11"
}
},
"nbformat": 4,
"nbformat_minor": 1
}
@@ -0,0 +1,36 @@
# Arxiv
>[arXiv](https://arxiv.org/) is an open-access archive for 2 million scholarly articles in the fields of physics,
> mathematics, computer science, quantitative biology, quantitative finance, statistics, electrical engineering and
> systems science, and economics.
## Installation and Setup
First, you need to install the `arxiv` Python package.
```bash
pip install arxiv
```
Second, you need to install the `PyMuPDF` Python package, which converts PDF files downloaded from the `arxiv.org` site into text.
```bash
pip install pymupdf
```
## Document Loader
See a [usage example](/docs/modules/data_connection/document_loaders/integrations/arxiv.html).
```python
from langchain.document_loaders import ArxivLoader
```
## Retriever
See a [usage example](/docs/modules/data_connection/retrievers/integrations/arxiv.html).
```python
from langchain.retrievers import ArxivRetriever
```
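A brief sketch of the loader in use, assuming both packages above are installed. The helper names are hypothetical, and the `Title` metadata key is an assumption about what the loader attaches to each document.

```python
def load_arxiv_papers(query, max_docs=2):
    """Fetch up to max_docs papers matching query (an arXiv ID or free text)."""
    from langchain.document_loaders import ArxivLoader  # deferred import

    return ArxivLoader(query=query, load_max_docs=max_docs).load()


def describe(docs):
    """Summarise each document as its title plus the length of the extracted text."""
    return [
        f"{d.metadata.get('Title', 'untitled')} ({len(d.page_content)} chars)"
        for d in docs
    ]
```

For instance, `describe(load_arxiv_papers("1605.08386"))` would list the matching paper with the size of its extracted text.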
@@ -0,0 +1,27 @@
# AtlasDB
This page covers how to use Nomic's Atlas ecosystem within LangChain.
It is broken into two parts: installation and setup, and then references to specific Atlas wrappers.
## Installation and Setup
- Install the Python package with `pip install nomic`
- Nomic is also included in LangChain's Poetry extras: `poetry install -E all`
## Wrappers
### VectorStore
There exists a wrapper around the Atlas neural database, allowing you to use it as a vectorstore.
This vectorstore also gives you full access to the underlying AtlasProject object, which will allow you to use the full range of Atlas map interactions, such as bulk tagging and automatic topic modeling.
Please see [the Atlas docs](https://docs.nomic.ai/atlas_api.html) for more detailed information.
To import this vectorstore:
```python
from langchain.vectorstores import AtlasDB
```
For a more detailed walkthrough of the AtlasDB wrapper, see [this notebook](/docs/modules/data_connection/vectorstores/integrations/atlas.html).
@@ -0,0 +1,21 @@
# AwaDB
>[AwaDB](https://github.com/awa-ai/awadb) is an AI Native database for the search and storage of embedding vectors used by LLM Applications.
## Installation and Setup
```bash
pip install awadb
```
## VectorStore
There exists a wrapper around AwaDB vector databases, allowing you to use it as a vectorstore,
whether for semantic search or example selection.
```python
from langchain.vectorstores import AwaDB
```
For a more detailed walkthrough of the AwaDB wrapper, see [here](/docs/modules/data_connection/vectorstores/integrations/awadb.html).
@@ -0,0 +1,25 @@
# AWS S3 Directory
>[Amazon Simple Storage Service (Amazon S3)](https://docs.aws.amazon.com/AmazonS3/latest/userguide/using-folders.html) is an object storage service.
>[AWS S3 Directory](https://docs.aws.amazon.com/AmazonS3/latest/userguide/using-folders.html)
>[AWS S3 Buckets](https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingBucket.html)
## Installation and Setup
```bash
pip install boto3
```
## Document Loader
See a [usage example for S3DirectoryLoader](/docs/modules/data_connection/document_loaders/integrations/aws_s3_directory.html).
See a [usage example for S3FileLoader](/docs/modules/data_connection/document_loaders/integrations/aws_s3_file.html).
```python
from langchain.document_loaders import S3DirectoryLoader, S3FileLoader
```
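One way to choose between the two loaders, sketched under the assumption that AWS credentials are already configured for `boto3`: split an `s3://` URI into bucket and key, then treat an empty key or a trailing slash as a directory prefix. The helper names and that convention are illustrative, not part of LangChain.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/key' into (bucket, key)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def load_from_s3(uri):
    """Load one object, or everything under a prefix, as documents."""
    from langchain.document_loaders import S3DirectoryLoader, S3FileLoader

    bucket, key = split_s3_uri(uri)
    if key == "" or key.endswith("/"):
        loader = S3DirectoryLoader(bucket, prefix=key)
    else:
        loader = S3FileLoader(bucket, key)
    return loader.load()
```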
@@ -0,0 +1,16 @@
# AZLyrics
>[AZLyrics](https://www.azlyrics.com/) is a large, legal, ever-growing collection of lyrics.
## Installation and Setup
There isn't any special setup for it.
## Document Loader
See a [usage example](/docs/modules/data_connection/document_loaders/integrations/azlyrics.html).
```python
from langchain.document_loaders import AZLyricsLoader
```
@@ -0,0 +1,36 @@
# Azure Blob Storage
>[Azure Blob Storage](https://learn.microsoft.com/en-us/azure/storage/blobs/storage-blobs-introduction) is Microsoft's object storage solution for the cloud. Blob Storage is optimized for storing massive amounts of unstructured data. Unstructured data is data that doesn't adhere to a particular data model or definition, such as text or binary data.
>[Azure Files](https://learn.microsoft.com/en-us/azure/storage/files/storage-files-introduction) offers fully managed
> file shares in the cloud that are accessible via the industry standard Server Message Block (`SMB`) protocol,
> Network File System (`NFS`) protocol, and `Azure Files REST API`. `Azure Files` are based on the `Azure Blob Storage`.
`Azure Blob Storage` is designed for:
- Serving images or documents directly to a browser.
- Storing files for distributed access.
- Streaming video and audio.
- Writing to log files.
- Storing data for backup and restore, disaster recovery, and archiving.
- Storing data for analysis by an on-premises or Azure-hosted service.
## Installation and Setup
```bash
pip install azure-storage-blob
```
## Document Loader
See a [usage example for the Azure Blob Storage](/docs/modules/data_connection/document_loaders/integrations/azure_blob_storage_container.html).
```python
from langchain.document_loaders import AzureBlobStorageContainerLoader
```
See a [usage example for the Azure Files](/docs/modules/data_connection/document_loaders/integrations/azure_blob_storage_file.html).
```python
from langchain.document_loaders import AzureBlobStorageFileLoader
```
@@ -0,0 +1,24 @@
# Azure Cognitive Search
>[Azure Cognitive Search](https://learn.microsoft.com/en-us/azure/search/search-what-is-azure-search) (formerly known as `Azure Search`) is a cloud search service that gives developers infrastructure, APIs, and tools for building a rich search experience over private, heterogeneous content in web, mobile, and enterprise applications.
>Search is foundational to any app that surfaces text to users, where common scenarios include catalog or document search, online retail apps, or data exploration over proprietary content. When you create a search service, you'll work with the following capabilities:
>- A search engine for full text search over a search index containing user-owned content
>- Rich indexing, with lexical analysis and optional AI enrichment for content extraction and transformation
>- Rich query syntax for text search, fuzzy search, autocomplete, geo-search and more
>- Programmability through REST APIs and client libraries in Azure SDKs
>- Azure integration at the data layer, machine learning layer, and AI (Cognitive Services)
## Installation and Setup
See [set up instructions](https://learn.microsoft.com/en-us/azure/search/search-create-service-portal).
## Retriever
See a [usage example](/docs/modules/data_connection/retrievers/integrations/azure_cognitive_search.html).
```python
from langchain.retrievers import AzureCognitiveSearchRetriever
```
@@ -0,0 +1,50 @@
# Azure OpenAI
>[Microsoft Azure](https://en.wikipedia.org/wiki/Microsoft_Azure), often referred to as `Azure`, is a cloud computing platform run by `Microsoft`, which offers access, management, and development of applications and services through global data centers. It provides a range of capabilities, including software as a service (SaaS), platform as a service (PaaS), and infrastructure as a service (IaaS). `Microsoft Azure` supports many programming languages, tools, and frameworks, including Microsoft-specific and third-party software and systems.
>[Azure OpenAI](https://learn.microsoft.com/en-us/azure/cognitive-services/openai/) is an `Azure` service with powerful language models from `OpenAI` including the `GPT-3`, `Codex` and `Embeddings model` series for content generation, summarization, semantic search, and natural language to code translation.
## Installation and Setup
```bash
pip install openai
pip install tiktoken
```
Set the environment variables to get access to the `Azure OpenAI` service.
```python
import os
os.environ["OPENAI_API_TYPE"] = "azure"
os.environ["OPENAI_API_BASE"] = "https://<your-endpoint>.openai.azure.com/"
os.environ["OPENAI_API_KEY"] = "your AzureOpenAI key"
os.environ["OPENAI_API_VERSION"] = "2023-05-15"
```
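A missing variable may only surface later as an authentication or routing error, so it can help to check all four up front. The following is a minimal sketch using only the standard library; the helper name `missing_azure_openai_vars` is hypothetical and is not part of LangChain or the `openai` package.

```python
import os

# The four variable names come from the setup snippet above.
REQUIRED_VARS = (
    "OPENAI_API_TYPE",
    "OPENAI_API_BASE",
    "OPENAI_API_KEY",
    "OPENAI_API_VERSION",
)


def missing_azure_openai_vars(environ=None):
    """Return the required variable names that are unset or empty.

    Hypothetical helper for illustration; not a LangChain API.
    """
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_azure_openai_vars()
    if missing:
        print("Set these variables first:", ", ".join(missing))
    else:
        print("Azure OpenAI environment looks complete.")
```

Passing a plain dictionary instead of relying on `os.environ` keeps the check easy to exercise in tests.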
## LLM
See a [usage example](/docs/modules/model_io/models/llms/integrations/azure_openai_example.html).
```python
from langchain.llms import AzureOpenAI
```
## Text Embedding Models
See a [usage example](/docs/modules/data_connection/text_embedding/integrations/azureopenai.html).
```python
from langchain.embeddings import OpenAIEmbeddings
```
## Chat Models
See a [usage example](/docs/modules/model_io/models/chat/integrations/azure_chat_openai.html).
```python
from langchain.chat_models import AzureChatOpenAI
```
@@ -0,0 +1,79 @@
# Banana
This page covers how to use the Banana ecosystem within LangChain.
It is broken into two parts: installation and setup, and then references to specific Banana wrappers.
## Installation and Setup
- Install with `pip install banana-dev`
- Get a Banana API key and set it as an environment variable (`BANANA_API_KEY`)
## Define your Banana Template
If you want to use an available language model template you can find one [here](https://app.banana.dev/templates/conceptofmind/serverless-template-palmyra-base).
This template uses the Palmyra-Base model by [Writer](https://writer.com/product/api/).
You can check out an example Banana repository [here](https://github.com/conceptofmind/serverless-template-palmyra-base).
## Build the Banana app
Banana Apps must include the "output" key in the return json.
There is a rigid response structure.
```python
# Return the results as a dictionary
result = {'output': result}
```
An example inference function would be:
```python
def inference(model_inputs: dict) -> dict:
    # `model` and `tokenizer` are expected to be loaded elsewhere as module globals
    global model
    global tokenizer

    # Parse out your arguments
    prompt = model_inputs.get('prompt', None)
    if prompt is None:
        return {'message': "No prompt provided"}

    # Run the model
    input_ids = tokenizer.encode(prompt, return_tensors='pt').cuda()
    output = model.generate(
        input_ids,
        max_length=100,
        do_sample=True,
        top_k=50,
        top_p=0.95,
        num_return_sequences=1,
        temperature=0.9,
        early_stopping=True,
        no_repeat_ngram_size=3,
        num_beams=5,
        length_penalty=1.5,
        repetition_penalty=1.5,
        bad_words_ids=[[tokenizer.encode(' ', add_prefix_space=True)[0]]]
    )

    result = tokenizer.decode(output[0], skip_special_tokens=True)
    # Return the results as a dictionary
    result = {'output': result}
    return result
```
You can find a full example of a Banana app [here](https://github.com/conceptofmind/serverless-template-palmyra-base/blob/main/app.py).
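The response rule stated in this section can be checked locally before deploying. This is a minimal sketch, assuming the only hard requirement is the one given on this page (a dictionary with an `output` key); the function name `check_banana_response` is hypothetical and is not part of the `banana-dev` package.

```python
def check_banana_response(response):
    """Raise ValueError unless `response` is a dict containing an "output" key.

    Hypothetical helper; it only encodes the rule stated on this page.
    """
    if not isinstance(response, dict):
        raise ValueError(f"expected a dict, got {type(response).__name__}")
    if "output" not in response:
        raise ValueError('response is missing the required "output" key')
    return response


# A well-formed response passes through unchanged.
ok = check_banana_response({"output": "generated text"})

# A response shaped like {'message': ...} has no "output" key,
# so this check rejects it.
try:
    check_banana_response({"message": "No prompt provided"})
except ValueError as err:
    print(err)
```

Note that the error branch of the example inference function returns a dictionary without `output`, so a caller that relies on that key should handle this case explicitly.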
## Wrappers
### LLM
There exists a Banana LLM wrapper, which you can access with:
```python
from langchain.llms import Banana
```
You need to provide a model key located in the dashboard:
```python
llm = Banana(model_key="YOUR_MODEL_KEY")
```
@@ -0,0 +1,25 @@
# Baseten
Learn how to use LangChain with models deployed on Baseten.
## Installation and setup
- Create a [Baseten](https://baseten.co) account and [API key](https://docs.baseten.co/settings/api-keys).
- Install the Baseten Python client with `pip install baseten`
- Use your API key to authenticate with `baseten login`
## Invoking a model
Baseten integrates with LangChain through the LLM module, which provides a standardized and interoperable interface for models that are deployed on your Baseten workspace.
You can deploy foundation models like WizardLM and Alpaca with one click from the [Baseten model library](https://app.baseten.co/explore/) or, if you have your own model, [deploy it with this tutorial](https://docs.baseten.co/deploying-models/deploy).
In this example, we'll work with WizardLM. [Deploy WizardLM here](https://app.baseten.co/explore/wizardlm) and follow along with the deployed [model's version ID](https://docs.baseten.co/managing-models/manage).
```python
from langchain.llms import Baseten
wizardlm = Baseten(model="MODEL_VERSION_ID", verbose=True)
wizardlm("What is the difference between a Wizard and a Sorcerer?")
```
@@ -0,0 +1,92 @@
# Beam
This page covers how to use Beam within LangChain.
It is broken into two parts: installation and setup, and then references to specific Beam wrappers.
## Installation and Setup
- [Create an account](https://www.beam.cloud/)
- Install the Beam CLI with `curl https://raw.githubusercontent.com/slai-labs/get-beam/main/get-beam.sh -sSfL | sh`
- Register API keys with `beam configure`
- Set the environment variables `BEAM_CLIENT_ID` and `BEAM_CLIENT_SECRET`
- Install the Beam SDK `pip install beam-sdk`
## Wrappers
### LLM
There exists a Beam LLM wrapper, which you can access with
```python
from langchain.llms.beam import Beam
```
## Define your Beam app
This is the environment you'll be developing against once you start the app.
It's also used to define the maximum response length from the model.
```python
llm = Beam(
    model_name="gpt2",
    name="langchain-gpt2-test",
    cpu=8,
    memory="32Gi",
    gpu="A10G",
    python_version="python3.8",
    python_packages=[
        "diffusers[torch]>=0.10",
        "transformers",
        "torch",
        "pillow",
        "accelerate",
        "safetensors",
        "xformers",
    ],
    max_length="50",
    verbose=False,
)
```
## Deploy your Beam app
Once defined, you can deploy your Beam app by calling your model's `_deploy()` method.
```python
llm._deploy()
```
## Call your Beam app
Once a Beam model is deployed, you can invoke it through your model's `_call()` method.
This returns the GPT2 text response to your prompt.
```python
response = llm._call("Running machine learning on a remote GPU")
```
An example script which deploys the model and calls it would be:
```python
from langchain.llms.beam import Beam
import time
llm = Beam(
    model_name="gpt2",
    name="langchain-gpt2-test",
    cpu=8,
    memory="32Gi",
    gpu="A10G",
    python_version="python3.8",
    python_packages=[
        "diffusers[torch]>=0.10",
        "transformers",
        "torch",
        "pillow",
        "accelerate",
        "safetensors",
        "xformers",
    ],
    max_length="50",
    verbose=False,
)
llm._deploy()
response = llm._call("Running machine learning on a remote GPU")
print(response)
```
@@ -0,0 +1,24 @@
# Bedrock
>[Amazon Bedrock](https://aws.amazon.com/bedrock/) is a fully managed service that makes FMs from leading AI startups and Amazon available via an API, so you can choose from a wide range of FMs to find the model that is best suited for your use case.
## Installation and Setup
```bash
pip install boto3
```
## LLM
See a [usage example](/docs/modules/model_io/models/llms/integrations/bedrock.html).
```python
from langchain.llms import Bedrock
```
## Text Embedding Models
See a [usage example](/docs/modules/data_connection/text_embedding/integrations/bedrock.html).
```python
from langchain.embeddings import BedrockEmbeddings
```
@@ -0,0 +1,17 @@
# BiliBili
>[Bilibili](https://www.bilibili.tv/) is one of the most beloved long-form video sites in China.
## Installation and Setup
```bash
pip install bilibili-api-python
```
## Document Loader
See a [usage example](/docs/modules/data_connection/document_loaders/integrations/bilibili.html).
```python
from langchain.document_loaders import BiliBiliLoader
```
@@ -0,0 +1,22 @@
# Blackboard
>[Blackboard Learn](https://en.wikipedia.org/wiki/Blackboard_Learn) (previously the `Blackboard Learning Management System`)
> is a web-based virtual learning environment and learning management system developed by Blackboard Inc.
> The software features course management, customizable open architecture, and scalable design that allows
> integration with student information systems and authentication protocols. It may be installed on local servers,
> hosted by `Blackboard ASP Solutions`, or provided as Software as a Service hosted on Amazon Web Services.
> Its main purposes are stated to include the addition of online elements to courses traditionally delivered
> face-to-face and development of completely online courses with few or no face-to-face meetings.
## Installation and Setup
There isn't any special setup for it.
## Document Loader
See a [usage example](/docs/modules/data_connection/document_loaders/integrations/blackboard.html).
```python
from langchain.document_loaders import BlackboardLoader
```
@@ -0,0 +1,36 @@
# Brave Search
>[Brave Search](https://en.wikipedia.org/wiki/Brave_Search) is a search engine developed by Brave Software.
> - `Brave Search` uses its own web index. As of May 2022, it covered over 10 billion pages and was used to serve 92%
> of search results without relying on any third-parties, with the remainder being retrieved
> server-side from the Bing API or (on an opt-in basis) client-side from Google. According
> to Brave, the index was kept "intentionally smaller than that of Google or Bing" in order to
> help avoid spam and other low-quality content, with the disadvantage that "Brave Search is
> not yet as good as Google in recovering long-tail queries."
>- `Brave Search Premium`: As of April 2023 Brave Search is an ad-free website, but it will
> eventually switch to a new model that will include ads and premium users will get an ad-free experience.
> User data including IP addresses won't be collected from its users by default. A premium account
> will be required for opt-in data-collection.
## Installation and Setup
To get access to the Brave Search API, you need to [create an account and get an API key](https://api.search.brave.com/app/dashboard).
## Document Loader
See a [usage example](/docs/modules/data_connection/document_loaders/integrations/brave_search.html).
```python
from langchain.document_loaders import BraveSearchLoader
```
## Tool
See a [usage example](/docs/modules/agents/tools/integrations/brave_search.html).
```python
from langchain.tools import BraveSearch
```
@@ -0,0 +1,35 @@
# Cassandra
>[Apache Cassandra®](https://cassandra.apache.org/) is a free and open-source, distributed, wide-column
> store, NoSQL database management system designed to handle large amounts of data across many commodity servers,
> providing high availability with no single point of failure. Cassandra offers support for clusters spanning
> multiple datacenters, with asynchronous masterless replication allowing low latency operations for all clients.
> Cassandra was designed to implement a combination of _Amazon's Dynamo_ distributed storage and replication
> techniques combined with _Google's Bigtable_ data and storage engine model.
## Installation and Setup
```bash
pip install cassandra-driver
pip install cassio
```
## Vector Store
See a [usage example](/docs/modules/data_connection/vectorstores/integrations/cassandra.html).
```python
from langchain.vectorstores import Cassandra
```
## Memory
See a [usage example](/docs/modules/memory/integrations/cassandra_chat_message_history.html).
```python
from langchain.memory import CassandraChatMessageHistory
```
@@ -0,0 +1,17 @@
# CerebriumAI
This page covers how to use the CerebriumAI ecosystem within LangChain.
It is broken into two parts: installation and setup, and then references to specific CerebriumAI wrappers.
## Installation and Setup
- Install with `pip install cerebrium`
- Get a CerebriumAI API key and set it as an environment variable (`CEREBRIUMAI_API_KEY`)
## Wrappers
### LLM
There exists a CerebriumAI LLM wrapper, which you can access with:
```python
from langchain.llms import CerebriumAI
```
@@ -0,0 +1,17 @@
# Chaindesk
>[Chaindesk](https://chaindesk.ai) is an [open source](https://github.com/gmpetrov/databerry) document retrieval platform that helps to connect your personal data with Large Language Models.
## Installation and Setup
Sign up for Chaindesk, create a datastore, add some data, and get your datastore API endpoint URL.
You also need the [API Key](https://docs.chaindesk.ai/api-reference/authentication).
## Retriever
See a [usage example](/docs/modules/data_connection/retrievers/integrations/chaindesk.html).
```python
from langchain.retrievers import ChaindeskRetriever
```
@@ -0,0 +1,29 @@
# Chroma
>[Chroma](https://docs.trychroma.com/getting-started) is a database for building AI applications with embeddings.
## Installation and Setup
```bash
pip install chromadb
```
## VectorStore
There exists a wrapper around Chroma vector databases, allowing you to use it as a vectorstore,
whether for semantic search or example selection.
```python
from langchain.vectorstores import Chroma
```
For a more detailed walkthrough of the Chroma wrapper, see [this notebook](/docs/modules/data_connection/vectorstores/integrations/chroma.html).
## Retriever
See a [usage example](/docs/modules/data_connection/retrievers/how_to/self_query/chroma_self_query.html).
```python
from langchain.retrievers import SelfQueryRetriever
```
@@ -0,0 +1,52 @@
# Clarifai
>[Clarifai](https://clarifai.com) is one of the first deep learning platforms, founded in 2013. Clarifai provides an AI platform with the full AI lifecycle for data exploration, data labeling, model training, evaluation and inference around images, video, text and audio data. In the LangChain ecosystem, as far as we're aware, Clarifai is the only provider that supports LLMs, embeddings and a vector store in one production scale platform, making it an excellent choice to operationalize your LangChain implementations.
## Installation and Setup
- Install the Python SDK:
```bash
pip install clarifai
```
[Sign-up](https://clarifai.com/signup) for a Clarifai account, then get a personal access token to access the Clarifai API from your [security settings](https://clarifai.com/settings/security) and set it as an environment variable (`CLARIFAI_PAT`).
## Models
Clarifai provides thousands of AI models for many different use cases. You can [explore them here](https://clarifai.com/explore) to find the one best suited to your use case. These models include those created by other providers such as OpenAI, Anthropic, Cohere, and AI21, as well as state-of-the-art open source models such as Falcon and InstructorXL, so that you can build the best in AI into your products. The models are organized by their creator's user_id and grouped into projects called applications, each denoted by its app_id. You will need those IDs in addition to the model_id and, optionally, the version_id, so make a note of all of them once you have found the best model for your use case!
Also note that because there are many models for image, video, text and audio understanding, you can build some interesting AI agents that use this variety of models as experts for those data types.
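Since every Clarifai wrapper on this page takes the same set of IDs, it can be convenient to keep them together. The following is a minimal sketch using only the standard library; the class name `ClarifaiModelRef` and its method are hypothetical, and the keyword names it produces (`pat`, `user_id`, `app_id`, `model_id`) are the ones used in the wrapper examples on this page. The keyword for the version ID is not shown on this page, so it is stored but deliberately not passed on.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ClarifaiModelRef:
    """Hypothetical container for the IDs that identify a Clarifai model."""

    user_id: str
    app_id: str
    model_id: str
    version_id: Optional[str] = None  # optional, as described in the text

    def wrapper_kwargs(self, pat: str) -> dict:
        """Build the keyword arguments used by the wrapper examples on this page."""
        return {
            "pat": pat,
            "user_id": self.user_id,
            "app_id": self.app_id,
            "model_id": self.model_id,
        }


# Placeholder values, not real IDs.
ref = ClarifaiModelRef(user_id="USER_ID", app_id="APP_ID", model_id="MODEL_ID")
kwargs = ref.wrapper_kwargs(pat="CLARIFAI_PAT")
print(sorted(kwargs))
```

The resulting dictionary can then be unpacked into a wrapper constructor, for example `Clarifai(**kwargs)`.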
### LLMs
To find the selection of LLMs in the Clarifai platform you can select the text to text model type [here](https://clarifai.com/explore/models?filterData=%5B%7B%22field%22%3A%22model_type_id%22%2C%22value%22%3A%5B%22text-to-text%22%5D%7D%5D&page=1&perPage=24).
```python
from langchain.llms import Clarifai
llm = Clarifai(pat=CLARIFAI_PAT, user_id=USER_ID, app_id=APP_ID, model_id=MODEL_ID)
```
For more details, the docs on the Clarifai LLM wrapper provide a [detailed walkthrough](/docs/modules/model_io/models/llms/integrations/clarifai.html).
### Text Embedding Models
To find the selection of text embeddings models in the Clarifai platform you can select the text to embedding model type [here](https://clarifai.com/explore/models?page=1&perPage=24&filterData=%5B%7B%22field%22%3A%22model_type_id%22%2C%22value%22%3A%5B%22text-embedder%22%5D%7D%5D).
There is a Clarifai Embedding model in LangChain, which you can access with:
```python
from langchain.embeddings import ClarifaiEmbeddings
embeddings = ClarifaiEmbeddings(pat=CLARIFAI_PAT, user_id=USER_ID, app_id=APP_ID, model_id=MODEL_ID)
```
For more details, the docs on the Clarifai Embeddings wrapper provide a [detailed walkthrough](/docs/modules/data_connection/text_embedding/integrations/clarifai.html).
## Vectorstore
Clarifai's vector DB was launched in 2016 and has been optimized to support live search queries. With workflows in the Clarifai platform, your data is automatically indexed by an embedding model, and optionally by other models as well, to index that information in the DB for search. You can query the DB not only via the vectors but also filter by metadata matches and other AI-predicted concepts, and even do geo-coordinate search. Simply create an application, select the appropriate base workflow for your type of data, and upload it (through the API as [documented here](https://docs.clarifai.com/api-guide/data/create-get-update-delete) or the UIs at clarifai.com).
You can also add data directly from LangChain, and the auto-indexing will take place for you. This is a little different from other vectorstores, where you need to provide an embedding model in the constructor and have LangChain coordinate getting the embeddings from text and writing them to the index. Not only is it more convenient, but it's much more scalable to use Clarifai's distributed cloud to do all the indexing in the background.
```python
from langchain.vectorstores import Clarifai
clarifai_vector_db = Clarifai.from_texts(user_id=USER_ID, app_id=APP_ID, texts=texts, pat=CLARIFAI_PAT, number_of_docs=NUMBER_OF_DOCS, metadatas = metadatas)
```
For more details, the docs on the Clarifai vector store provide a [detailed walkthrough](/docs/modules/data_connection/text_embedding/integrations/clarifai.html).
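The `from_texts` call in this section takes `texts` and `metadatas` as separate arguments. Here is a minimal sketch of preparing them, assuming the usual LangChain vector store convention that the two lists are parallel (one metadata dictionary per text, matched by position). The helper name `split_records` and the sample records are made up for illustration.

```python
def split_records(records):
    """Split (text, metadata) pairs into two parallel lists.

    Hypothetical helper; pairing by position is an assumption about
    how the vector store matches texts to metadata.
    """
    texts = [text for text, _ in records]
    metadatas = [dict(meta) for _, meta in records]
    return texts, metadatas


# Made-up sample data.
records = [
    ("I really enjoy spending time with you", {"id": 1, "source": "book 1"}),
    ("The weather was pleasant all week", {"id": 2, "source": "book 2"}),
]
texts, metadatas = split_records(records)

# Both lists must have the same length for the pairing to make sense.
assert len(texts) == len(metadatas)
```

Keeping each text next to its metadata until the last moment makes it harder for the two lists to drift out of step.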
@@ -0,0 +1,610 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# ClearML\n",
"\n",
"> [ClearML](https://github.com/allegroai/clearml) is an ML/DL development and production suite. It contains 5 main modules:\n",
"> - `Experiment Manager` - Automagical experiment tracking, environments and results\n",
"> - `MLOps` - Orchestration, Automation & Pipelines solution for ML/DL jobs (K8s / Cloud / bare-metal)\n",
"> - `Data-Management` - Fully differentiable data management & version control solution on top of object-storage (S3 / GS / Azure / NAS)\n",
"> - `Model-Serving` - cloud-ready Scalable model serving solution!\n",
" Deploy new model endpoints in under 5 minutes\n",
" Includes optimized GPU serving support backed by Nvidia-Triton\n",
" with out-of-the-box Model Monitoring\n",
"> - `Fire Reports` - Create and share rich MarkDown documents supporting embeddable online content\n",
"\n",
"In order to properly keep track of your langchain experiments and their results, you can enable the `ClearML` integration. We use the `ClearML Experiment Manager` that neatly tracks and organizes all your experiment runs.\n",
"\n",
"<a target=\"_blank\" href=\"https://colab.research.google.com/github/hwchase17/langchain/blob/master/docs/ecosystem/clearml_tracking.html\">\n",
" <img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/>\n",
"</a>"
]
},
{
"cell_type": "markdown",
"metadata": {
"tags": []
},
"source": [
"## Installation and Setup"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!pip install clearml\n",
"!pip install pandas\n",
"!pip install textstat\n",
"!pip install spacy\n",
"!python -m spacy download en_core_web_sm"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Getting API Credentials\n",
"\n",
"We'll be using several APIs in this notebook. Here is a list of them and where to get their credentials:\n",
"\n",
"- ClearML: https://app.clear.ml/settings/workspace-configuration\n",
"- OpenAI: https://platform.openai.com/account/api-keys\n",
"- SerpAPI (google search): https://serpapi.com/dashboard"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"os.environ[\"CLEARML_API_ACCESS_KEY\"] = \"\"\n",
"os.environ[\"CLEARML_API_SECRET_KEY\"] = \"\"\n",
"\n",
"os.environ[\"OPENAI_API_KEY\"] = \"\"\n",
"os.environ[\"SERPAPI_API_KEY\"] = \"\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Callbacks"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.callbacks import ClearMLCallbackHandler"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The clearml callback is currently in beta and is subject to change based on updates to `langchain`. Please report any issues to https://github.com/allegroai/clearml/issues with the tag `langchain`.\n"
]
}
],
"source": [
"from datetime import datetime\n",
"from langchain.callbacks import StdOutCallbackHandler\n",
"from langchain.llms import OpenAI\n",
"\n",
"# Setup and use the ClearML Callback\n",
"clearml_callback = ClearMLCallbackHandler(\n",
" task_type=\"inference\",\n",
" project_name=\"langchain_callback_demo\",\n",
" task_name=\"llm\",\n",
" tags=[\"test\"],\n",
" # Change the following parameters based on the amount of detail you want tracked\n",
" visualize=True,\n",
" complexity_metrics=True,\n",
" stream_logs=True,\n",
")\n",
"callbacks = [StdOutCallbackHandler(), clearml_callback]\n",
"# Get the OpenAI model ready to go\n",
"llm = OpenAI(temperature=0, callbacks=callbacks)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Scenario 1: Just an LLM\n",
"\n",
"First, let's just run a single LLM a few times and capture the resulting prompt-answer conversation in ClearML"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'action': 'on_llm_start', 'name': 'OpenAI', 'step': 3, 'starts': 2, 'ends': 1, 'errors': 0, 'text_ctr': 0, 'chain_starts': 0, 'chain_ends': 0, 'llm_starts': 2, 'llm_ends': 1, 'llm_streams': 0, 'tool_starts': 0, 'tool_ends': 0, 'agent_ends': 0, 'prompts': 'Tell me a joke'}\n",
"{'action': 'on_llm_start', 'name': 'OpenAI', 'step': 3, 'starts': 2, 'ends': 1, 'errors': 0, 'text_ctr': 0, 'chain_starts': 0, 'chain_ends': 0, 'llm_starts': 2, 'llm_ends': 1, 'llm_streams': 0, 'tool_starts': 0, 'tool_ends': 0, 'agent_ends': 0, 'prompts': 'Tell me a poem'}\n",
"{'action': 'on_llm_start', 'name': 'OpenAI', 'step': 3, 'starts': 2, 'ends': 1, 'errors': 0, 'text_ctr': 0, 'chain_starts': 0, 'chain_ends': 0, 'llm_starts': 2, 'llm_ends': 1, 'llm_streams': 0, 'tool_starts': 0, 'tool_ends': 0, 'agent_ends': 0, 'prompts': 'Tell me a joke'}\n",
"{'action': 'on_llm_start', 'name': 'OpenAI', 'step': 3, 'starts': 2, 'ends': 1, 'errors': 0, 'text_ctr': 0, 'chain_starts': 0, 'chain_ends': 0, 'llm_starts': 2, 'llm_ends': 1, 'llm_streams': 0, 'tool_starts': 0, 'tool_ends': 0, 'agent_ends': 0, 'prompts': 'Tell me a poem'}\n",
"{'action': 'on_llm_start', 'name': 'OpenAI', 'step': 3, 'starts': 2, 'ends': 1, 'errors': 0, 'text_ctr': 0, 'chain_starts': 0, 'chain_ends': 0, 'llm_starts': 2, 'llm_ends': 1, 'llm_streams': 0, 'tool_starts': 0, 'tool_ends': 0, 'agent_ends': 0, 'prompts': 'Tell me a joke'}\n",
"{'action': 'on_llm_start', 'name': 'OpenAI', 'step': 3, 'starts': 2, 'ends': 1, 'errors': 0, 'text_ctr': 0, 'chain_starts': 0, 'chain_ends': 0, 'llm_starts': 2, 'llm_ends': 1, 'llm_streams': 0, 'tool_starts': 0, 'tool_ends': 0, 'agent_ends': 0, 'prompts': 'Tell me a poem'}\n",
"{'action': 'on_llm_end', 'token_usage_prompt_tokens': 24, 'token_usage_completion_tokens': 138, 'token_usage_total_tokens': 162, 'model_name': 'text-davinci-003', 'step': 4, 'starts': 2, 'ends': 2, 'errors': 0, 'text_ctr': 0, 'chain_starts': 0, 'chain_ends': 0, 'llm_starts': 2, 'llm_ends': 2, 'llm_streams': 0, 'tool_starts': 0, 'tool_ends': 0, 'agent_ends': 0, 'text': '\\n\\nQ: What did the fish say when it hit the wall?\\nA: Dam!', 'generation_info_finish_reason': 'stop', 'generation_info_logprobs': None, 'flesch_reading_ease': 109.04, 'flesch_kincaid_grade': 1.3, 'smog_index': 0.0, 'coleman_liau_index': -1.24, 'automated_readability_index': 0.3, 'dale_chall_readability_score': 5.5, 'difficult_words': 0, 'linsear_write_formula': 5.5, 'gunning_fog': 5.2, 'text_standard': '5th and 6th grade', 'fernandez_huerta': 133.58, 'szigriszt_pazos': 131.54, 'gutierrez_polini': 62.3, 'crawford': -0.2, 'gulpease_index': 79.8, 'osman': 116.91}\n",
"{'action': 'on_llm_end', 'token_usage_prompt_tokens': 24, 'token_usage_completion_tokens': 138, 'token_usage_total_tokens': 162, 'model_name': 'text-davinci-003', 'step': 4, 'starts': 2, 'ends': 2, 'errors': 0, 'text_ctr': 0, 'chain_starts': 0, 'chain_ends': 0, 'llm_starts': 2, 'llm_ends': 2, 'llm_streams': 0, 'tool_starts': 0, 'tool_ends': 0, 'agent_ends': 0, 'text': '\\n\\nRoses are red,\\nViolets are blue,\\nSugar is sweet,\\nAnd so are you.', 'generation_info_finish_reason': 'stop', 'generation_info_logprobs': None, 'flesch_reading_ease': 83.66, 'flesch_kincaid_grade': 4.8, 'smog_index': 0.0, 'coleman_liau_index': 3.23, 'automated_readability_index': 3.9, 'dale_chall_readability_score': 6.71, 'difficult_words': 2, 'linsear_write_formula': 6.5, 'gunning_fog': 8.28, 'text_standard': '6th and 7th grade', 'fernandez_huerta': 115.58, 'szigriszt_pazos': 112.37, 'gutierrez_polini': 54.83, 'crawford': 1.4, 'gulpease_index': 72.1, 'osman': 100.17}\n",
"{'action': 'on_llm_end', 'token_usage_prompt_tokens': 24, 'token_usage_completion_tokens': 138, 'token_usage_total_tokens': 162, 'model_name': 'text-davinci-003', 'step': 4, 'starts': 2, 'ends': 2, 'errors': 0, 'text_ctr': 0, 'chain_starts': 0, 'chain_ends': 0, 'llm_starts': 2, 'llm_ends': 2, 'llm_streams': 0, 'tool_starts': 0, 'tool_ends': 0, 'agent_ends': 0, 'text': '\\n\\nQ: What did the fish say when it hit the wall?\\nA: Dam!', 'generation_info_finish_reason': 'stop', 'generation_info_logprobs': None, 'flesch_reading_ease': 109.04, 'flesch_kincaid_grade': 1.3, 'smog_index': 0.0, 'coleman_liau_index': -1.24, 'automated_readability_index': 0.3, 'dale_chall_readability_score': 5.5, 'difficult_words': 0, 'linsear_write_formula': 5.5, 'gunning_fog': 5.2, 'text_standard': '5th and 6th grade', 'fernandez_huerta': 133.58, 'szigriszt_pazos': 131.54, 'gutierrez_polini': 62.3, 'crawford': -0.2, 'gulpease_index': 79.8, 'osman': 116.91}\n",
"{'action': 'on_llm_end', 'token_usage_prompt_tokens': 24, 'token_usage_completion_tokens': 138, 'token_usage_total_tokens': 162, 'model_name': 'text-davinci-003', 'step': 4, 'starts': 2, 'ends': 2, 'errors': 0, 'text_ctr': 0, 'chain_starts': 0, 'chain_ends': 0, 'llm_starts': 2, 'llm_ends': 2, 'llm_streams': 0, 'tool_starts': 0, 'tool_ends': 0, 'agent_ends': 0, 'text': '\\n\\nRoses are red,\\nViolets are blue,\\nSugar is sweet,\\nAnd so are you.', 'generation_info_finish_reason': 'stop', 'generation_info_logprobs': None, 'flesch_reading_ease': 83.66, 'flesch_kincaid_grade': 4.8, 'smog_index': 0.0, 'coleman_liau_index': 3.23, 'automated_readability_index': 3.9, 'dale_chall_readability_score': 6.71, 'difficult_words': 2, 'linsear_write_formula': 6.5, 'gunning_fog': 8.28, 'text_standard': '6th and 7th grade', 'fernandez_huerta': 115.58, 'szigriszt_pazos': 112.37, 'gutierrez_polini': 54.83, 'crawford': 1.4, 'gulpease_index': 72.1, 'osman': 100.17}\n",
"{'action': 'on_llm_end', 'token_usage_prompt_tokens': 24, 'token_usage_completion_tokens': 138, 'token_usage_total_tokens': 162, 'model_name': 'text-davinci-003', 'step': 4, 'starts': 2, 'ends': 2, 'errors': 0, 'text_ctr': 0, 'chain_starts': 0, 'chain_ends': 0, 'llm_starts': 2, 'llm_ends': 2, 'llm_streams': 0, 'tool_starts': 0, 'tool_ends': 0, 'agent_ends': 0, 'text': '\\n\\nQ: What did the fish say when it hit the wall?\\nA: Dam!', 'generation_info_finish_reason': 'stop', 'generation_info_logprobs': None, 'flesch_reading_ease': 109.04, 'flesch_kincaid_grade': 1.3, 'smog_index': 0.0, 'coleman_liau_index': -1.24, 'automated_readability_index': 0.3, 'dale_chall_readability_score': 5.5, 'difficult_words': 0, 'linsear_write_formula': 5.5, 'gunning_fog': 5.2, 'text_standard': '5th and 6th grade', 'fernandez_huerta': 133.58, 'szigriszt_pazos': 131.54, 'gutierrez_polini': 62.3, 'crawford': -0.2, 'gulpease_index': 79.8, 'osman': 116.91}\n",
"{'action': 'on_llm_end', 'token_usage_prompt_tokens': 24, 'token_usage_completion_tokens': 138, 'token_usage_total_tokens': 162, 'model_name': 'text-davinci-003', 'step': 4, 'starts': 2, 'ends': 2, 'errors': 0, 'text_ctr': 0, 'chain_starts': 0, 'chain_ends': 0, 'llm_starts': 2, 'llm_ends': 2, 'llm_streams': 0, 'tool_starts': 0, 'tool_ends': 0, 'agent_ends': 0, 'text': '\\n\\nRoses are red,\\nViolets are blue,\\nSugar is sweet,\\nAnd so are you.', 'generation_info_finish_reason': 'stop', 'generation_info_logprobs': None, 'flesch_reading_ease': 83.66, 'flesch_kincaid_grade': 4.8, 'smog_index': 0.0, 'coleman_liau_index': 3.23, 'automated_readability_index': 3.9, 'dale_chall_readability_score': 6.71, 'difficult_words': 2, 'linsear_write_formula': 6.5, 'gunning_fog': 8.28, 'text_standard': '6th and 7th grade', 'fernandez_huerta': 115.58, 'szigriszt_pazos': 112.37, 'gutierrez_polini': 54.83, 'crawford': 1.4, 'gulpease_index': 72.1, 'osman': 100.17}\n",
"{'action_records': action name step starts ends errors text_ctr chain_starts \\\n",
"0 on_llm_start OpenAI 1 1 0 0 0 0 \n",
"1 on_llm_start OpenAI 1 1 0 0 0 0 \n",
"2 on_llm_start OpenAI 1 1 0 0 0 0 \n",
"3 on_llm_start OpenAI 1 1 0 0 0 0 \n",
"4 on_llm_start OpenAI 1 1 0 0 0 0 \n",
"5 on_llm_start OpenAI 1 1 0 0 0 0 \n",
"6 on_llm_end NaN 2 1 1 0 0 0 \n",
"7 on_llm_end NaN 2 1 1 0 0 0 \n",
"8 on_llm_end NaN 2 1 1 0 0 0 \n",
"9 on_llm_end NaN 2 1 1 0 0 0 \n",
"10 on_llm_end NaN 2 1 1 0 0 0 \n",
"11 on_llm_end NaN 2 1 1 0 0 0 \n",
"12 on_llm_start OpenAI 3 2 1 0 0 0 \n",
"13 on_llm_start OpenAI 3 2 1 0 0 0 \n",
"14 on_llm_start OpenAI 3 2 1 0 0 0 \n",
"15 on_llm_start OpenAI 3 2 1 0 0 0 \n",
"16 on_llm_start OpenAI 3 2 1 0 0 0 \n",
"17 on_llm_start OpenAI 3 2 1 0 0 0 \n",
"18 on_llm_end NaN 4 2 2 0 0 0 \n",
"19 on_llm_end NaN 4 2 2 0 0 0 \n",
"20 on_llm_end NaN 4 2 2 0 0 0 \n",
"21 on_llm_end NaN 4 2 2 0 0 0 \n",
"22 on_llm_end NaN 4 2 2 0 0 0 \n",
"23 on_llm_end NaN 4 2 2 0 0 0 \n",
"\n",
" chain_ends llm_starts ... difficult_words linsear_write_formula \\\n",
"0 0 1 ... NaN NaN \n",
"1 0 1 ... NaN NaN \n",
"2 0 1 ... NaN NaN \n",
"3 0 1 ... NaN NaN \n",
"4 0 1 ... NaN NaN \n",
"5 0 1 ... NaN NaN \n",
"6 0 1 ... 0.0 5.5 \n",
"7 0 1 ... 2.0 6.5 \n",
"8 0 1 ... 0.0 5.5 \n",
"9 0 1 ... 2.0 6.5 \n",
"10 0 1 ... 0.0 5.5 \n",
"11 0 1 ... 2.0 6.5 \n",
"12 0 2 ... NaN NaN \n",
"13 0 2 ... NaN NaN \n",
"14 0 2 ... NaN NaN \n",
"15 0 2 ... NaN NaN \n",
"16 0 2 ... NaN NaN \n",
"17 0 2 ... NaN NaN \n",
"18 0 2 ... 0.0 5.5 \n",
"19 0 2 ... 2.0 6.5 \n",
"20 0 2 ... 0.0 5.5 \n",
"21 0 2 ... 2.0 6.5 \n",
"22 0 2 ... 0.0 5.5 \n",
"23 0 2 ... 2.0 6.5 \n",
"\n",
" gunning_fog text_standard fernandez_huerta szigriszt_pazos \\\n",
"0 NaN NaN NaN NaN \n",
"1 NaN NaN NaN NaN \n",
"2 NaN NaN NaN NaN \n",
"3 NaN NaN NaN NaN \n",
"4 NaN NaN NaN NaN \n",
"5 NaN NaN NaN NaN \n",
"6 5.20 5th and 6th grade 133.58 131.54 \n",
"7 8.28 6th and 7th grade 115.58 112.37 \n",
"8 5.20 5th and 6th grade 133.58 131.54 \n",
"9 8.28 6th and 7th grade 115.58 112.37 \n",
"10 5.20 5th and 6th grade 133.58 131.54 \n",
"11 8.28 6th and 7th grade 115.58 112.37 \n",
"12 NaN NaN NaN NaN \n",
"13 NaN NaN NaN NaN \n",
"14 NaN NaN NaN NaN \n",
"15 NaN NaN NaN NaN \n",
"16 NaN NaN NaN NaN \n",
"17 NaN NaN NaN NaN \n",
"18 5.20 5th and 6th grade 133.58 131.54 \n",
"19 8.28 6th and 7th grade 115.58 112.37 \n",
"20 5.20 5th and 6th grade 133.58 131.54 \n",
"21 8.28 6th and 7th grade 115.58 112.37 \n",
"22 5.20 5th and 6th grade 133.58 131.54 \n",
"23 8.28 6th and 7th grade 115.58 112.37 \n",
"\n",
" gutierrez_polini crawford gulpease_index osman \n",
"0 NaN NaN NaN NaN \n",
"1 NaN NaN NaN NaN \n",
"2 NaN NaN NaN NaN \n",
"3 NaN NaN NaN NaN \n",
"4 NaN NaN NaN NaN \n",
"5 NaN NaN NaN NaN \n",
"6 62.30 -0.2 79.8 116.91 \n",
"7 54.83 1.4 72.1 100.17 \n",
"8 62.30 -0.2 79.8 116.91 \n",
"9 54.83 1.4 72.1 100.17 \n",
"10 62.30 -0.2 79.8 116.91 \n",
"11 54.83 1.4 72.1 100.17 \n",
"12 NaN NaN NaN NaN \n",
"13 NaN NaN NaN NaN \n",
"14 NaN NaN NaN NaN \n",
"15 NaN NaN NaN NaN \n",
"16 NaN NaN NaN NaN \n",
"17 NaN NaN NaN NaN \n",
"18 62.30 -0.2 79.8 116.91 \n",
"19 54.83 1.4 72.1 100.17 \n",
"20 62.30 -0.2 79.8 116.91 \n",
"21 54.83 1.4 72.1 100.17 \n",
"22 62.30 -0.2 79.8 116.91 \n",
"23 54.83 1.4 72.1 100.17 \n",
"\n",
"[24 rows x 39 columns], 'session_analysis': prompt_step prompts name output_step \\\n",
"0 1 Tell me a joke OpenAI 2 \n",
"1 1 Tell me a poem OpenAI 2 \n",
"2 1 Tell me a joke OpenAI 2 \n",
"3 1 Tell me a poem OpenAI 2 \n",
"4 1 Tell me a joke OpenAI 2 \n",
"5 1 Tell me a poem OpenAI 2 \n",
"6 3 Tell me a joke OpenAI 4 \n",
"7 3 Tell me a poem OpenAI 4 \n",
"8 3 Tell me a joke OpenAI 4 \n",
"9 3 Tell me a poem OpenAI 4 \n",
"10 3 Tell me a joke OpenAI 4 \n",
"11 3 Tell me a poem OpenAI 4 \n",
"\n",
" output \\\n",
"0 \\n\\nQ: What did the fish say when it hit the w... \n",
"1 \\n\\nRoses are red,\\nViolets are blue,\\nSugar i... \n",
"2 \\n\\nQ: What did the fish say when it hit the w... \n",
"3 \\n\\nRoses are red,\\nViolets are blue,\\nSugar i... \n",
"4 \\n\\nQ: What did the fish say when it hit the w... \n",
"5 \\n\\nRoses are red,\\nViolets are blue,\\nSugar i... \n",
"6 \\n\\nQ: What did the fish say when it hit the w... \n",
"7 \\n\\nRoses are red,\\nViolets are blue,\\nSugar i... \n",
"8 \\n\\nQ: What did the fish say when it hit the w... \n",
"9 \\n\\nRoses are red,\\nViolets are blue,\\nSugar i... \n",
"10 \\n\\nQ: What did the fish say when it hit the w... \n",
"11 \\n\\nRoses are red,\\nViolets are blue,\\nSugar i... \n",
"\n",
" token_usage_total_tokens token_usage_prompt_tokens \\\n",
"0 162 24 \n",
"1 162 24 \n",
"2 162 24 \n",
"3 162 24 \n",
"4 162 24 \n",
"5 162 24 \n",
"6 162 24 \n",
"7 162 24 \n",
"8 162 24 \n",
"9 162 24 \n",
"10 162 24 \n",
"11 162 24 \n",
"\n",
" token_usage_completion_tokens flesch_reading_ease flesch_kincaid_grade \\\n",
"0 138 109.04 1.3 \n",
"1 138 83.66 4.8 \n",
"2 138 109.04 1.3 \n",
"3 138 83.66 4.8 \n",
"4 138 109.04 1.3 \n",
"5 138 83.66 4.8 \n",
"6 138 109.04 1.3 \n",
"7 138 83.66 4.8 \n",
"8 138 109.04 1.3 \n",
"9 138 83.66 4.8 \n",
"10 138 109.04 1.3 \n",
"11 138 83.66 4.8 \n",
"\n",
" ... difficult_words linsear_write_formula gunning_fog \\\n",
"0 ... 0 5.5 5.20 \n",
"1 ... 2 6.5 8.28 \n",
"2 ... 0 5.5 5.20 \n",
"3 ... 2 6.5 8.28 \n",
"4 ... 0 5.5 5.20 \n",
"5 ... 2 6.5 8.28 \n",
"6 ... 0 5.5 5.20 \n",
"7 ... 2 6.5 8.28 \n",
"8 ... 0 5.5 5.20 \n",
"9 ... 2 6.5 8.28 \n",
"10 ... 0 5.5 5.20 \n",
"11 ... 2 6.5 8.28 \n",
"\n",
" text_standard fernandez_huerta szigriszt_pazos gutierrez_polini \\\n",
"0 5th and 6th grade 133.58 131.54 62.30 \n",
"1 6th and 7th grade 115.58 112.37 54.83 \n",
"2 5th and 6th grade 133.58 131.54 62.30 \n",
"3 6th and 7th grade 115.58 112.37 54.83 \n",
"4 5th and 6th grade 133.58 131.54 62.30 \n",
"5 6th and 7th grade 115.58 112.37 54.83 \n",
"6 5th and 6th grade 133.58 131.54 62.30 \n",
"7 6th and 7th grade 115.58 112.37 54.83 \n",
"8 5th and 6th grade 133.58 131.54 62.30 \n",
"9 6th and 7th grade 115.58 112.37 54.83 \n",
"10 5th and 6th grade 133.58 131.54 62.30 \n",
"11 6th and 7th grade 115.58 112.37 54.83 \n",
"\n",
" crawford gulpease_index osman \n",
"0 -0.2 79.8 116.91 \n",
"1 1.4 72.1 100.17 \n",
"2 -0.2 79.8 116.91 \n",
"3 1.4 72.1 100.17 \n",
"4 -0.2 79.8 116.91 \n",
"5 1.4 72.1 100.17 \n",
"6 -0.2 79.8 116.91 \n",
"7 1.4 72.1 100.17 \n",
"8 -0.2 79.8 116.91 \n",
"9 1.4 72.1 100.17 \n",
"10 -0.2 79.8 116.91 \n",
"11 1.4 72.1 100.17 \n",
"\n",
"[12 rows x 24 columns]}\n",
"2023-03-29 14:00:25,948 - clearml.Task - INFO - Completed model upload to https://files.clear.ml/langchain_callback_demo/llm.988bd727b0e94a29a3ac0ee526813545/models/simple_sequential\n"
]
}
],
"source": [
"# SCENARIO 1 - LLM\n",
"llm_result = llm.generate([\"Tell me a joke\", \"Tell me a poem\"] * 3)\n",
"# After every generation run, use flush to make sure all the metrics,\n",
"# prompts, and other output are properly saved separately\n",
"clearml_callback.flush_tracker(langchain_asset=llm, name=\"simple_sequential\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"At this point you can already go to https://app.clear.ml and take a look at the resulting ClearML Task that was created.\n",
"\n",
"Among other things, you should see that this notebook is saved along with any git information. The model JSON that contains the used parameters is saved as an artifact. There are also console logs, and under the plots section you'll find tables that represent the flow of the chain.\n",
"\n",
"Finally, if you enabled visualizations, these are stored as HTML files under debug samples."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Scenario 2: Creating an agent with tools\n",
"\n",
"To show a more advanced workflow, let's create an agent with access to tools. ClearML tracks the results in the same way; only the table will look slightly different, because other types of actions are taken compared to the earlier, simpler example.\n",
"\n",
"You can now also see the use of the `finish=True` keyword, which will fully close the ClearML Task, instead of just resetting the parameters and prompts for a new conversation."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"{'action': 'on_chain_start', 'name': 'AgentExecutor', 'step': 1, 'starts': 1, 'ends': 0, 'errors': 0, 'text_ctr': 0, 'chain_starts': 1, 'chain_ends': 0, 'llm_starts': 0, 'llm_ends': 0, 'llm_streams': 0, 'tool_starts': 0, 'tool_ends': 0, 'agent_ends': 0, 'input': 'Who is the wife of the person who sang summer of 69?'}\n",
"{'action': 'on_llm_start', 'name': 'OpenAI', 'step': 2, 'starts': 2, 'ends': 0, 'errors': 0, 'text_ctr': 0, 'chain_starts': 1, 'chain_ends': 0, 'llm_starts': 1, 'llm_ends': 0, 'llm_streams': 0, 'tool_starts': 0, 'tool_ends': 0, 'agent_ends': 0, 'prompts': 'Answer the following questions as best you can. You have access to the following tools:\\n\\nSearch: A search engine. Useful for when you need to answer questions about current events. Input should be a search query.\\nCalculator: Useful for when you need to answer questions about math.\\n\\nUse the following format:\\n\\nQuestion: the input question you must answer\\nThought: you should always think about what to do\\nAction: the action to take, should be one of [Search, Calculator]\\nAction Input: the input to the action\\nObservation: the result of the action\\n... (this Thought/Action/Action Input/Observation can repeat N times)\\nThought: I now know the final answer\\nFinal Answer: the final answer to the original input question\\n\\nBegin!\\n\\nQuestion: Who is the wife of the person who sang summer of 69?\\nThought:'}\n",
"{'action': 'on_llm_end', 'token_usage_prompt_tokens': 189, 'token_usage_completion_tokens': 34, 'token_usage_total_tokens': 223, 'model_name': 'text-davinci-003', 'step': 3, 'starts': 2, 'ends': 1, 'errors': 0, 'text_ctr': 0, 'chain_starts': 1, 'chain_ends': 0, 'llm_starts': 1, 'llm_ends': 1, 'llm_streams': 0, 'tool_starts': 0, 'tool_ends': 0, 'agent_ends': 0, 'text': ' I need to find out who sang summer of 69 and then find out who their wife is.\\nAction: Search\\nAction Input: \"Who sang summer of 69\"', 'generation_info_finish_reason': 'stop', 'generation_info_logprobs': None, 'flesch_reading_ease': 91.61, 'flesch_kincaid_grade': 3.8, 'smog_index': 0.0, 'coleman_liau_index': 3.41, 'automated_readability_index': 3.5, 'dale_chall_readability_score': 6.06, 'difficult_words': 2, 'linsear_write_formula': 5.75, 'gunning_fog': 5.4, 'text_standard': '3rd and 4th grade', 'fernandez_huerta': 121.07, 'szigriszt_pazos': 119.5, 'gutierrez_polini': 54.91, 'crawford': 0.9, 'gulpease_index': 72.7, 'osman': 92.16}\n",
"\u001b[32;1m\u001b[1;3m I need to find out who sang summer of 69 and then find out who their wife is.\n",
"Action: Search\n",
"Action Input: \"Who sang summer of 69\"\u001b[0m{'action': 'on_agent_action', 'tool': 'Search', 'tool_input': 'Who sang summer of 69', 'log': ' I need to find out who sang summer of 69 and then find out who their wife is.\\nAction: Search\\nAction Input: \"Who sang summer of 69\"', 'step': 4, 'starts': 3, 'ends': 1, 'errors': 0, 'text_ctr': 0, 'chain_starts': 1, 'chain_ends': 0, 'llm_starts': 1, 'llm_ends': 1, 'llm_streams': 0, 'tool_starts': 1, 'tool_ends': 0, 'agent_ends': 0}\n",
"{'action': 'on_tool_start', 'input_str': 'Who sang summer of 69', 'name': 'Search', 'description': 'A search engine. Useful for when you need to answer questions about current events. Input should be a search query.', 'step': 5, 'starts': 4, 'ends': 1, 'errors': 0, 'text_ctr': 0, 'chain_starts': 1, 'chain_ends': 0, 'llm_starts': 1, 'llm_ends': 1, 'llm_streams': 0, 'tool_starts': 2, 'tool_ends': 0, 'agent_ends': 0}\n",
"\n",
"Observation: \u001b[36;1m\u001b[1;3mBryan Adams - Summer Of 69 (Official Music Video).\u001b[0m\n",
"Thought:{'action': 'on_tool_end', 'output': 'Bryan Adams - Summer Of 69 (Official Music Video).', 'step': 6, 'starts': 4, 'ends': 2, 'errors': 0, 'text_ctr': 0, 'chain_starts': 1, 'chain_ends': 0, 'llm_starts': 1, 'llm_ends': 1, 'llm_streams': 0, 'tool_starts': 2, 'tool_ends': 1, 'agent_ends': 0}\n",
"{'action': 'on_llm_start', 'name': 'OpenAI', 'step': 7, 'starts': 5, 'ends': 2, 'errors': 0, 'text_ctr': 0, 'chain_starts': 1, 'chain_ends': 0, 'llm_starts': 2, 'llm_ends': 1, 'llm_streams': 0, 'tool_starts': 2, 'tool_ends': 1, 'agent_ends': 0, 'prompts': 'Answer the following questions as best you can. You have access to the following tools:\\n\\nSearch: A search engine. Useful for when you need to answer questions about current events. Input should be a search query.\\nCalculator: Useful for when you need to answer questions about math.\\n\\nUse the following format:\\n\\nQuestion: the input question you must answer\\nThought: you should always think about what to do\\nAction: the action to take, should be one of [Search, Calculator]\\nAction Input: the input to the action\\nObservation: the result of the action\\n... (this Thought/Action/Action Input/Observation can repeat N times)\\nThought: I now know the final answer\\nFinal Answer: the final answer to the original input question\\n\\nBegin!\\n\\nQuestion: Who is the wife of the person who sang summer of 69?\\nThought: I need to find out who sang summer of 69 and then find out who their wife is.\\nAction: Search\\nAction Input: \"Who sang summer of 69\"\\nObservation: Bryan Adams - Summer Of 69 (Official Music Video).\\nThought:'}\n",
"{'action': 'on_llm_end', 'token_usage_prompt_tokens': 242, 'token_usage_completion_tokens': 28, 'token_usage_total_tokens': 270, 'model_name': 'text-davinci-003', 'step': 8, 'starts': 5, 'ends': 3, 'errors': 0, 'text_ctr': 0, 'chain_starts': 1, 'chain_ends': 0, 'llm_starts': 2, 'llm_ends': 2, 'llm_streams': 0, 'tool_starts': 2, 'tool_ends': 1, 'agent_ends': 0, 'text': ' I need to find out who Bryan Adams is married to.\\nAction: Search\\nAction Input: \"Who is Bryan Adams married to\"', 'generation_info_finish_reason': 'stop', 'generation_info_logprobs': None, 'flesch_reading_ease': 94.66, 'flesch_kincaid_grade': 2.7, 'smog_index': 0.0, 'coleman_liau_index': 4.73, 'automated_readability_index': 4.0, 'dale_chall_readability_score': 7.16, 'difficult_words': 2, 'linsear_write_formula': 4.25, 'gunning_fog': 4.2, 'text_standard': '4th and 5th grade', 'fernandez_huerta': 124.13, 'szigriszt_pazos': 119.2, 'gutierrez_polini': 52.26, 'crawford': 0.7, 'gulpease_index': 74.7, 'osman': 84.2}\n",
"\u001b[32;1m\u001b[1;3m I need to find out who Bryan Adams is married to.\n",
"Action: Search\n",
"Action Input: \"Who is Bryan Adams married to\"\u001b[0m{'action': 'on_agent_action', 'tool': 'Search', 'tool_input': 'Who is Bryan Adams married to', 'log': ' I need to find out who Bryan Adams is married to.\\nAction: Search\\nAction Input: \"Who is Bryan Adams married to\"', 'step': 9, 'starts': 6, 'ends': 3, 'errors': 0, 'text_ctr': 0, 'chain_starts': 1, 'chain_ends': 0, 'llm_starts': 2, 'llm_ends': 2, 'llm_streams': 0, 'tool_starts': 3, 'tool_ends': 1, 'agent_ends': 0}\n",
"{'action': 'on_tool_start', 'input_str': 'Who is Bryan Adams married to', 'name': 'Search', 'description': 'A search engine. Useful for when you need to answer questions about current events. Input should be a search query.', 'step': 10, 'starts': 7, 'ends': 3, 'errors': 0, 'text_ctr': 0, 'chain_starts': 1, 'chain_ends': 0, 'llm_starts': 2, 'llm_ends': 2, 'llm_streams': 0, 'tool_starts': 4, 'tool_ends': 1, 'agent_ends': 0}\n",
"\n",
"Observation: \u001b[36;1m\u001b[1;3mBryan Adams has never married. In the 1990s, he was in a relationship with Danish model Cecilie Thomsen. In 2011, Bryan and Alicia Grimaldi, his ...\u001b[0m\n",
"Thought:{'action': 'on_tool_end', 'output': 'Bryan Adams has never married. In the 1990s, he was in a relationship with Danish model Cecilie Thomsen. In 2011, Bryan and Alicia Grimaldi, his ...', 'step': 11, 'starts': 7, 'ends': 4, 'errors': 0, 'text_ctr': 0, 'chain_starts': 1, 'chain_ends': 0, 'llm_starts': 2, 'llm_ends': 2, 'llm_streams': 0, 'tool_starts': 4, 'tool_ends': 2, 'agent_ends': 0}\n",
"{'action': 'on_llm_start', 'name': 'OpenAI', 'step': 12, 'starts': 8, 'ends': 4, 'errors': 0, 'text_ctr': 0, 'chain_starts': 1, 'chain_ends': 0, 'llm_starts': 3, 'llm_ends': 2, 'llm_streams': 0, 'tool_starts': 4, 'tool_ends': 2, 'agent_ends': 0, 'prompts': 'Answer the following questions as best you can. You have access to the following tools:\\n\\nSearch: A search engine. Useful for when you need to answer questions about current events. Input should be a search query.\\nCalculator: Useful for when you need to answer questions about math.\\n\\nUse the following format:\\n\\nQuestion: the input question you must answer\\nThought: you should always think about what to do\\nAction: the action to take, should be one of [Search, Calculator]\\nAction Input: the input to the action\\nObservation: the result of the action\\n... (this Thought/Action/Action Input/Observation can repeat N times)\\nThought: I now know the final answer\\nFinal Answer: the final answer to the original input question\\n\\nBegin!\\n\\nQuestion: Who is the wife of the person who sang summer of 69?\\nThought: I need to find out who sang summer of 69 and then find out who their wife is.\\nAction: Search\\nAction Input: \"Who sang summer of 69\"\\nObservation: Bryan Adams - Summer Of 69 (Official Music Video).\\nThought: I need to find out who Bryan Adams is married to.\\nAction: Search\\nAction Input: \"Who is Bryan Adams married to\"\\nObservation: Bryan Adams has never married. In the 1990s, he was in a relationship with Danish model Cecilie Thomsen. In 2011, Bryan and Alicia Grimaldi, his ...\\nThought:'}\n",
"{'action': 'on_llm_end', 'token_usage_prompt_tokens': 314, 'token_usage_completion_tokens': 18, 'token_usage_total_tokens': 332, 'model_name': 'text-davinci-003', 'step': 13, 'starts': 8, 'ends': 5, 'errors': 0, 'text_ctr': 0, 'chain_starts': 1, 'chain_ends': 0, 'llm_starts': 3, 'llm_ends': 3, 'llm_streams': 0, 'tool_starts': 4, 'tool_ends': 2, 'agent_ends': 0, 'text': ' I now know the final answer.\\nFinal Answer: Bryan Adams has never been married.', 'generation_info_finish_reason': 'stop', 'generation_info_logprobs': None, 'flesch_reading_ease': 81.29, 'flesch_kincaid_grade': 3.7, 'smog_index': 0.0, 'coleman_liau_index': 5.75, 'automated_readability_index': 3.9, 'dale_chall_readability_score': 7.37, 'difficult_words': 1, 'linsear_write_formula': 2.5, 'gunning_fog': 2.8, 'text_standard': '3rd and 4th grade', 'fernandez_huerta': 115.7, 'szigriszt_pazos': 110.84, 'gutierrez_polini': 49.79, 'crawford': 0.7, 'gulpease_index': 85.4, 'osman': 83.14}\n",
"\u001b[32;1m\u001b[1;3m I now know the final answer.\n",
"Final Answer: Bryan Adams has never been married.\u001b[0m\n",
"{'action': 'on_agent_finish', 'output': 'Bryan Adams has never been married.', 'log': ' I now know the final answer.\\nFinal Answer: Bryan Adams has never been married.', 'step': 14, 'starts': 8, 'ends': 6, 'errors': 0, 'text_ctr': 0, 'chain_starts': 1, 'chain_ends': 0, 'llm_starts': 3, 'llm_ends': 3, 'llm_streams': 0, 'tool_starts': 4, 'tool_ends': 2, 'agent_ends': 1}\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"{'action': 'on_chain_end', 'outputs': 'Bryan Adams has never been married.', 'step': 15, 'starts': 8, 'ends': 7, 'errors': 0, 'text_ctr': 0, 'chain_starts': 1, 'chain_ends': 1, 'llm_starts': 3, 'llm_ends': 3, 'llm_streams': 0, 'tool_starts': 4, 'tool_ends': 2, 'agent_ends': 1}\n",
"{'action_records': action name step starts ends errors text_ctr \\\n",
"0 on_llm_start OpenAI 1 1 0 0 0 \n",
"1 on_llm_start OpenAI 1 1 0 0 0 \n",
"2 on_llm_start OpenAI 1 1 0 0 0 \n",
"3 on_llm_start OpenAI 1 1 0 0 0 \n",
"4 on_llm_start OpenAI 1 1 0 0 0 \n",
".. ... ... ... ... ... ... ... \n",
"66 on_tool_end NaN 11 7 4 0 0 \n",
"67 on_llm_start OpenAI 12 8 4 0 0 \n",
"68 on_llm_end NaN 13 8 5 0 0 \n",
"69 on_agent_finish NaN 14 8 6 0 0 \n",
"70 on_chain_end NaN 15 8 7 0 0 \n",
"\n",
" chain_starts chain_ends llm_starts ... gulpease_index osman input \\\n",
"0 0 0 1 ... NaN NaN NaN \n",
"1 0 0 1 ... NaN NaN NaN \n",
"2 0 0 1 ... NaN NaN NaN \n",
"3 0 0 1 ... NaN NaN NaN \n",
"4 0 0 1 ... NaN NaN NaN \n",
".. ... ... ... ... ... ... ... \n",
"66 1 0 2 ... NaN NaN NaN \n",
"67 1 0 3 ... NaN NaN NaN \n",
"68 1 0 3 ... 85.4 83.14 NaN \n",
"69 1 0 3 ... NaN NaN NaN \n",
"70 1 1 3 ... NaN NaN NaN \n",
"\n",
" tool tool_input log \\\n",
"0 NaN NaN NaN \n",
"1 NaN NaN NaN \n",
"2 NaN NaN NaN \n",
"3 NaN NaN NaN \n",
"4 NaN NaN NaN \n",
".. ... ... ... \n",
"66 NaN NaN NaN \n",
"67 NaN NaN NaN \n",
"68 NaN NaN NaN \n",
"69 NaN NaN I now know the final answer.\\nFinal Answer: B... \n",
"70 NaN NaN NaN \n",
"\n",
" input_str description output \\\n",
"0 NaN NaN NaN \n",
"1 NaN NaN NaN \n",
"2 NaN NaN NaN \n",
"3 NaN NaN NaN \n",
"4 NaN NaN NaN \n",
".. ... ... ... \n",
"66 NaN NaN Bryan Adams has never married. In the 1990s, h... \n",
"67 NaN NaN NaN \n",
"68 NaN NaN NaN \n",
"69 NaN NaN Bryan Adams has never been married. \n",
"70 NaN NaN NaN \n",
"\n",
" outputs \n",
"0 NaN \n",
"1 NaN \n",
"2 NaN \n",
"3 NaN \n",
"4 NaN \n",
".. ... \n",
"66 NaN \n",
"67 NaN \n",
"68 NaN \n",
"69 NaN \n",
"70 Bryan Adams has never been married. \n",
"\n",
"[71 rows x 47 columns], 'session_analysis': prompt_step prompts name \\\n",
"0 2 Answer the following questions as best you can... OpenAI \n",
"1 7 Answer the following questions as best you can... OpenAI \n",
"2 12 Answer the following questions as best you can... OpenAI \n",
"\n",
" output_step output \\\n",
"0 3 I need to find out who sang summer of 69 and ... \n",
"1 8 I need to find out who Bryan Adams is married... \n",
"2 13 I now know the final answer.\\nFinal Answer: B... \n",
"\n",
" token_usage_total_tokens token_usage_prompt_tokens \\\n",
"0 223 189 \n",
"1 270 242 \n",
"2 332 314 \n",
"\n",
" token_usage_completion_tokens flesch_reading_ease flesch_kincaid_grade \\\n",
"0 34 91.61 3.8 \n",
"1 28 94.66 2.7 \n",
"2 18 81.29 3.7 \n",
"\n",
" ... difficult_words linsear_write_formula gunning_fog \\\n",
"0 ... 2 5.75 5.4 \n",
"1 ... 2 4.25 4.2 \n",
"2 ... 1 2.50 2.8 \n",
"\n",
" text_standard fernandez_huerta szigriszt_pazos gutierrez_polini \\\n",
"0 3rd and 4th grade 121.07 119.50 54.91 \n",
"1 4th and 5th grade 124.13 119.20 52.26 \n",
"2 3rd and 4th grade 115.70 110.84 49.79 \n",
"\n",
" crawford gulpease_index osman \n",
"0 0.9 72.7 92.16 \n",
"1 0.7 74.7 84.20 \n",
"2 0.7 85.4 83.14 \n",
"\n",
"[3 rows x 24 columns]}\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"Could not update last created model in Task 988bd727b0e94a29a3ac0ee526813545, Task status 'completed' cannot be updated\n"
]
}
],
"source": [
"from langchain.agents import initialize_agent, load_tools\n",
"from langchain.agents import AgentType\n",
"\n",
"# SCENARIO 2 - Agent with Tools\n",
"tools = load_tools([\"serpapi\", \"llm-math\"], llm=llm, callbacks=callbacks)\n",
"agent = initialize_agent(\n",
" tools,\n",
" llm,\n",
" agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,\n",
" callbacks=callbacks,\n",
")\n",
"agent.run(\"Who is the wife of the person who sang summer of 69?\")\n",
"clearml_callback.flush_tracker(\n",
" langchain_asset=agent, name=\"Agent with Tools\", finish=True\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Tips and Next Steps\n",
"\n",
"- Make sure you always use a unique `name` argument for the `clearml_callback.flush_tracker` function. If not, the model parameters used for a run will overwrite those of the previous run!\n",
"\n",
"- If you close the ClearML Callback using `clearml_callback.flush_tracker(..., finish=True)`, the Callback cannot be used anymore. Make a new one if you want to keep logging.\n",
"\n",
"- Check out the rest of the open source ClearML ecosystem: there is a data version manager, a remote execution agent, automated pipelines, and much more!\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.6"
},
"vscode": {
"interpreter": {
"hash": "a53ebf4a859167383b364e7e7521d0add3c2dbbdecce4edf676e8c4634ff3fbb"
}
}
},
"nbformat": 4,
"nbformat_minor": 4
}
@@ -0,0 +1,110 @@
# CnosDB
> [CnosDB](https://github.com/cnosdb/cnosdb) is an open source distributed time series database with high performance, a high compression rate, and ease of use.
## Installation and Setup
```bash
pip install cnos-connector
```
## Connecting to CnosDB
You can connect to CnosDB using the `SQLDatabase.from_cnosdb()` method.
### Syntax
```python
def SQLDatabase.from_cnosdb(url: str = "127.0.0.1:8902",
user: str = "root",
password: str = "",
tenant: str = "cnosdb",
database: str = "public")
```
Args:
1. url (str): The HTTP connection host name and port number of the CnosDB
service, excluding "http://" or "https://", with a default value
of "127.0.0.1:8902".
2. user (str): The username used to connect to the CnosDB service, with a
default value of "root".
3. password (str): The password of the user connecting to the CnosDB service,
with a default value of "".
4. tenant (str): The name of the tenant used to connect to the CnosDB service,
with a default value of "cnosdb".
5. database (str): The name of the database in the CnosDB tenant, with a
default value of "public".
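
Because `url` must not include "http://" or "https://", it can help to normalize an address before passing it in. The following is a minimal sketch under that assumption; `normalize_cnosdb_url` and `connection_kwargs` are hypothetical names used for illustration and are not part of LangChain or `cnos-connector`.

```python
def normalize_cnosdb_url(url: str) -> str:
    """Strip a leading http:// or https:// and any trailing slash.

    Hypothetical helper, not part of LangChain or cnos-connector.
    """
    for scheme in ("http://", "https://"):
        if url.lower().startswith(scheme):
            url = url[len(scheme):]
            break
    return url.rstrip("/")


# Keyword arguments mirroring the defaults documented above.
connection_kwargs = {
    "url": normalize_cnosdb_url("http://127.0.0.1:8902/"),
    "user": "root",
    "password": "",
    "tenant": "cnosdb",
    "database": "public",
}
print(connection_kwargs["url"])
```

With a CnosDB service running, the dictionary could then be unpacked as `SQLDatabase.from_cnosdb(**connection_kwargs)`.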
## Examples
```python
# Connecting to CnosDB with SQLDatabase Wrapper
from langchain import SQLDatabase
db = SQLDatabase.from_cnosdb()
```
```python
# Creating an OpenAI Chat LLM Wrapper
from langchain.chat_models import ChatOpenAI
llm = ChatOpenAI(temperature=0, model_name="gpt-3.5-turbo")
```
### SQL Database Chain
This example demonstrates the use of the SQL Chain for answering a question over a CnosDB database.
```python
from langchain import SQLDatabaseChain
db_chain = SQLDatabaseChain.from_llm(llm, db, verbose=True)
db_chain.run(
    "What is the average temperature of air at station XiaoMaiDao between October 19, 2022 and October 20, 2022?"
)
```
```shell
> Entering new chain...
What is the average temperature of air at station XiaoMaiDao between October 19, 2022 and October 20, 2022?
SQLQuery:SELECT AVG(temperature) FROM air WHERE station = 'XiaoMaiDao' AND time >= '2022-10-19' AND time < '2022-10-20'
SQLResult: [(68.0,)]
Answer:The average temperature of air at station XiaoMaiDao between October 19, 2022 and October 20, 2022 is 68.0.
> Finished chain.
```
### SQL Database Agent
This example demonstrates the use of the SQL Database Agent for answering questions over a CnosDB database.
```python
from langchain.agents import create_sql_agent
from langchain.agents.agent_toolkits import SQLDatabaseToolkit
toolkit = SQLDatabaseToolkit(db=db, llm=llm)
agent = create_sql_agent(llm=llm, toolkit=toolkit, verbose=True)
```
```python
agent.run(
    "What is the average temperature of air at station XiaoMaiDao between October 19, 2022 and October 20, 2022?"
)
```
```shell
> Entering new chain...
Action: sql_db_list_tables
Action Input: ""
Observation: air
Thought:The "air" table seems relevant to the question. I should query the schema of the "air" table to see what columns are available.
Action: sql_db_schema
Action Input: "air"
Observation:
CREATE TABLE air (
pressure FLOAT,
station STRING,
temperature FLOAT,
time TIMESTAMP,
visibility FLOAT
)
/*
3 rows from air table:
pressure station temperature time visibility
75.0 XiaoMaiDao 67.0 2022-10-19T03:40:00 54.0
77.0 XiaoMaiDao 69.0 2022-10-19T04:40:00 56.0
76.0 XiaoMaiDao 68.0 2022-10-19T05:40:00 55.0
*/
Thought:The "temperature" column in the "air" table is relevant to the question. I can query the average temperature between the specified dates.
Action: sql_db_query
Action Input: "SELECT AVG(temperature) FROM air WHERE station = 'XiaoMaiDao' AND time >= '2022-10-19' AND time <= '2022-10-20'"
Observation: [(68.0,)]
Thought:The average temperature of air at station XiaoMaiDao between October 19, 2022 and October 20, 2022 is 68.0.
Final Answer: 68.0
> Finished chain.
```
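
As a quick sanity check on the reported result, the three sample rows shown in the schema observation can be averaged by hand. This is only a sketch: it assumes those three rows are representative of the queried range, and the real `air` table may contain more rows.

```python
from statistics import mean

# Temperatures copied from the three sample rows in the schema observation.
sample_temperatures = [67.0, 69.0, 68.0]

# Arithmetic mean of the sample rows, comparable to AVG(temperature).
average = mean(sample_temperatures)
print(average)
```

The value agrees with the `[(68.0,)]` observation returned by both the chain and the agent.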
@@ -0,0 +1,38 @@
# Cohere
>[Cohere](https://cohere.ai/about) is a Canadian startup that provides natural language processing models
> that help companies improve human-machine interactions.
## Installation and Setup
- Install the Python SDK:
```bash
pip install cohere
```
- Get a [Cohere API key](https://dashboard.cohere.ai/) and set it as an environment variable (`COHERE_API_KEY`)
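
Before constructing any of the wrappers, it can be useful to confirm that the key is actually visible to the Python process. The following is a minimal sketch; `get_cohere_api_key` is a hypothetical helper and not part of LangChain or the Cohere SDK.

```python
import os


def get_cohere_api_key(env=None) -> str:
    """Return COHERE_API_KEY from the environment or raise a clear error.

    Hypothetical helper; `env` defaults to os.environ and exists only so the
    lookup can be exercised with a plain dictionary.
    """
    env = os.environ if env is None else env
    key = env.get("COHERE_API_KEY", "").strip()
    if not key:
        raise RuntimeError("COHERE_API_KEY is not set")
    return key


# Demonstration with a placeholder value instead of a real key.
print(get_cohere_api_key({"COHERE_API_KEY": "placeholder-key"}))
```

Calling `get_cohere_api_key()` with no argument reads the real environment and fails early if the variable is missing.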
## LLM
There exists a Cohere LLM wrapper; see a [usage example](/docs/modules/model_io/models/llms/integrations/cohere.html). You can access it with
```python
from langchain.llms import Cohere
```
## Text Embedding Model
There exists a Cohere Embedding model, which you can access with
```python
from langchain.embeddings import CohereEmbeddings
```
For a more detailed walkthrough of this, see [this notebook](/docs/modules/data_connection/text_embedding/integrations/cohere.html).
## Retriever
See a [usage example](/docs/modules/data_connection/retrievers/integrations/cohere-reranker.html).
```python
from langchain.retrievers.document_compressors import CohereRerank
```
@@ -0,0 +1,16 @@
# College Confidential
>[College Confidential](https://www.collegeconfidential.com/) gives information on 3,800+ colleges and universities.
## Installation and Setup
There isn't any special setup for it.
## Document Loader
See a [usage example](/docs/modules/data_connection/document_loaders/integrations/college_confidential.html).
```python
from langchain.document_loaders import CollegeConfidentialLoader
```
@@ -0,0 +1,348 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Comet"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![](https://user-images.githubusercontent.com/7529846/230328046-a8b18c51-12e3-4617-9b39-97614a571a2d.png)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In this guide we will demonstrate how to track your LangChain experiments, evaluation metrics, and LLM sessions with [Comet](https://www.comet.com/site/?utm_source=langchain&utm_medium=referral&utm_campaign=comet_notebook). \n",
"\n",
"<a target=\"_blank\" href=\"https://colab.research.google.com/github/hwchase17/langchain/blob/master/docs/ecosystem/comet_tracking.html\">\n",
" <img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/>\n",
"</a>\n",
"\n",
"**Example Project:** [Comet with LangChain](https://www.comet.com/examples/comet-example-langchain/view/b5ZThK6OFdhKWVSP3fDfRtrNF/panels?utm_source=langchain&utm_medium=referral&utm_campaign=comet_notebook)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![](https://user-images.githubusercontent.com/7529846/230326720-a9711435-9c6f-4edb-a707-94b67271ab25.png)\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Install Comet and Dependencies"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%pip install comet_ml langchain openai google-search-results spacy textstat pandas\n",
"\n",
"import sys\n",
"\n",
"!{sys.executable} -m spacy download en_core_web_sm"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Initialize Comet and Set your Credentials"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can grab your [Comet API Key here](https://www.comet.com/signup?utm_source=langchain&utm_medium=referral&utm_campaign=comet_notebook) or click the link after initializing Comet"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import comet_ml\n",
"\n",
"comet_ml.init(project_name=\"comet-example-langchain\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Set OpenAI and SerpAPI credentials"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You will need an [OpenAI API Key](https://platform.openai.com/account/api-keys) and a [SerpAPI API Key](https://serpapi.com/dashboard) to run the following examples"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"os.environ[\"OPENAI_API_KEY\"] = \"...\"\n",
"# os.environ[\"OPENAI_ORGANIZATION\"] = \"...\"\n",
"os.environ[\"SERPAPI_API_KEY\"] = \"...\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Scenario 1: Using just an LLM"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from datetime import datetime\n",
"\n",
"from langchain.callbacks import CometCallbackHandler, StdOutCallbackHandler\n",
"from langchain.llms import OpenAI\n",
"\n",
"comet_callback = CometCallbackHandler(\n",
" project_name=\"comet-example-langchain\",\n",
" complexity_metrics=True,\n",
" stream_logs=True,\n",
" tags=[\"llm\"],\n",
" visualizations=[\"dep\"],\n",
")\n",
"callbacks = [StdOutCallbackHandler(), comet_callback]\n",
"llm = OpenAI(temperature=0.9, callbacks=callbacks, verbose=True)\n",
"\n",
"llm_result = llm.generate([\"Tell me a joke\", \"Tell me a poem\", \"Tell me a fact\"] * 3)\n",
"print(\"LLM result\", llm_result)\n",
"comet_callback.flush_tracker(llm, finish=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Scenario 2: Using an LLM in a Chain"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from langchain.callbacks import CometCallbackHandler, StdOutCallbackHandler\n",
"from langchain.chains import LLMChain\n",
"from langchain.llms import OpenAI\n",
"from langchain.prompts import PromptTemplate\n",
"\n",
"comet_callback = CometCallbackHandler(\n",
" complexity_metrics=True,\n",
" project_name=\"comet-example-langchain\",\n",
" stream_logs=True,\n",
" tags=[\"synopsis-chain\"],\n",
")\n",
"callbacks = [StdOutCallbackHandler(), comet_callback]\n",
"llm = OpenAI(temperature=0.9, callbacks=callbacks)\n",
"\n",
"template = \"\"\"You are a playwright. Given the title of play, it is your job to write a synopsis for that title.\n",
"Title: {title}\n",
"Playwright: This is a synopsis for the above play:\"\"\"\n",
"prompt_template = PromptTemplate(input_variables=[\"title\"], template=template)\n",
"synopsis_chain = LLMChain(llm=llm, prompt=prompt_template, callbacks=callbacks)\n",
"\n",
"test_prompts = [{\"title\": \"Documentary about Bigfoot in Paris\"}]\n",
"print(synopsis_chain.apply(test_prompts))\n",
"comet_callback.flush_tracker(synopsis_chain, finish=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Scenario 3: Using An Agent with Tools "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from langchain.agents import initialize_agent, load_tools\n",
"from langchain.callbacks import CometCallbackHandler, StdOutCallbackHandler\n",
"from langchain.llms import OpenAI\n",
"\n",
"comet_callback = CometCallbackHandler(\n",
" project_name=\"comet-example-langchain\",\n",
" complexity_metrics=True,\n",
" stream_logs=True,\n",
" tags=[\"agent\"],\n",
")\n",
"callbacks = [StdOutCallbackHandler(), comet_callback]\n",
"llm = OpenAI(temperature=0.9, callbacks=callbacks)\n",
"\n",
"tools = load_tools([\"serpapi\", \"llm-math\"], llm=llm, callbacks=callbacks)\n",
"agent = initialize_agent(\n",
" tools,\n",
" llm,\n",
" agent=\"zero-shot-react-description\",\n",
" callbacks=callbacks,\n",
" verbose=True,\n",
")\n",
"agent.run(\n",
" \"Who is Leo DiCaprio's girlfriend? What is her current age raised to the 0.43 power?\"\n",
")\n",
"comet_callback.flush_tracker(agent, finish=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Scenario 4: Using Custom Evaluation Metrics"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The `CometCallbackManager` also allows you to define and use Custom Evaluation Metrics to assess generated outputs from your model. Let's take a look at how this works. \n",
"\n",
"\n",
"In the snippet below, we will use the [ROUGE](https://huggingface.co/spaces/evaluate-metric/rouge) metric to evaluate the quality of a generated summary of an input prompt. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%pip install rouge-score"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from rouge_score import rouge_scorer\n",
"\n",
"from langchain.callbacks import CometCallbackHandler, StdOutCallbackHandler\n",
"from langchain.chains import LLMChain\n",
"from langchain.llms import OpenAI\n",
"from langchain.prompts import PromptTemplate\n",
"\n",
"\n",
"class Rouge:\n",
" def __init__(self, reference):\n",
" self.reference = reference\n",
" self.scorer = rouge_scorer.RougeScorer([\"rougeLsum\"], use_stemmer=True)\n",
"\n",
" def compute_metric(self, generation, prompt_idx, gen_idx):\n",
" prediction = generation.text\n",
" results = self.scorer.score(target=self.reference, prediction=prediction)\n",
"\n",
" return {\n",
" \"rougeLsum_score\": results[\"rougeLsum\"].fmeasure,\n",
" \"reference\": self.reference,\n",
" }\n",
"\n",
"\n",
"reference = \"\"\"\n",
"The tower is 324 metres (1,063 ft) tall, about the same height as an 81-storey building.\n",
"It was the first structure to reach a height of 300 metres.\n",
"\n",
"It is now taller than the Chrysler Building in New York City by 5.2 metres (17 ft)\n",
"Excluding transmitters, the Eiffel Tower is the second tallest free-standing structure in France .\n",
"\"\"\"\n",
"rouge_score = Rouge(reference=reference)\n",
"\n",
"template = \"\"\"Given the following article, it is your job to write a summary.\n",
"Article:\n",
"{article}\n",
"Summary: This is the summary for the above article:\"\"\"\n",
"prompt_template = PromptTemplate(input_variables=[\"article\"], template=template)\n",
"\n",
"comet_callback = CometCallbackHandler(\n",
" project_name=\"comet-example-langchain\",\n",
" complexity_metrics=False,\n",
" stream_logs=True,\n",
" tags=[\"custom_metrics\"],\n",
" custom_metrics=rouge_score.compute_metric,\n",
")\n",
"callbacks = [StdOutCallbackHandler(), comet_callback]\n",
"llm = OpenAI(temperature=0.9)\n",
"\n",
"synopsis_chain = LLMChain(llm=llm, prompt=prompt_template)\n",
"\n",
"test_prompts = [\n",
" {\n",
" \"article\": \"\"\"\n",
" The tower is 324 metres (1,063 ft) tall, about the same height as\n",
" an 81-storey building, and the tallest structure in Paris. Its base is square,\n",
" measuring 125 metres (410 ft) on each side.\n",
" During its construction, the Eiffel Tower surpassed the\n",
" Washington Monument to become the tallest man-made structure in the world,\n",
" a title it held for 41 years until the Chrysler Building\n",
" in New York City was finished in 1930.\n",
"\n",
" It was the first structure to reach a height of 300 metres.\n",
" Due to the addition of a broadcasting aerial at the top of the tower in 1957,\n",
" it is now taller than the Chrysler Building by 5.2 metres (17 ft).\n",
"\n",
" Excluding transmitters, the Eiffel Tower is the second tallest\n",
" free-standing structure in France after the Millau Viaduct.\n",
" \"\"\"\n",
" }\n",
"]\n",
"print(synopsis_chain.apply(test_prompts, callbacks=callbacks))\n",
"comet_callback.flush_tracker(synopsis_chain, finish=True)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.3"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
@@ -0,0 +1,22 @@
# Confluence
>[Confluence](https://www.atlassian.com/software/confluence) is a wiki collaboration platform that saves and organizes all of the project-related material. `Confluence` is a knowledge base that primarily handles content management activities.
## Installation and Setup
```bash
pip install atlassian-python-api
```
Set up authentication with either `username/api_key` or `OAuth2 login`.
See [instructions](https://support.atlassian.com/atlassian-account/docs/manage-api-tokens-for-your-atlassian-account/).
## Document Loader
See a [usage example](/docs/modules/data_connection/document_loaders/integrations/confluence.html).
```python
from langchain.document_loaders import ConfluenceLoader
```
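As a minimal sketch of the `username/api_key` setup, the helper below reads credentials from environment variables and passes them to the loader. The variable names `CONFLUENCE_URL`, `CONFLUENCE_USERNAME`, and `CONFLUENCE_API_KEY` and the helper functions are hypothetical, chosen for illustration only; the linked usage example remains the reference for the loader itself.

```python
import os

# Hypothetical environment variable names chosen for this sketch.
REQUIRED_VARS = ("CONFLUENCE_URL", "CONFLUENCE_USERNAME", "CONFLUENCE_API_KEY")


def confluence_auth_kwargs(env=None):
    """Build keyword arguments for ConfluenceLoader from environment variables."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return {
        "url": env["CONFLUENCE_URL"],
        "username": env["CONFLUENCE_USERNAME"],
        "api_key": env["CONFLUENCE_API_KEY"],
    }


def load_space(space_key, limit=50):
    """Load the pages of one Confluence space as LangChain documents."""
    # Imported lazily so the helper above can be used without langchain installed.
    from langchain.document_loaders import ConfluenceLoader

    loader = ConfluenceLoader(**confluence_auth_kwargs())
    return loader.load(space_key=space_key, limit=limit)
```

Calling `load_space("SPACE")` with a real space key would then return the documents for that space, provided the credentials are valid.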
@@ -0,0 +1,57 @@
# C Transformers
This page covers how to use the [C Transformers](https://github.com/marella/ctransformers) library within LangChain.
It is broken into two parts: installation and setup, and then references to specific C Transformers wrappers.
## Installation and Setup
- Install the Python package with `pip install ctransformers`
- Download a supported [GGML model](https://huggingface.co/TheBloke) (see [Supported Models](https://github.com/marella/ctransformers#supported-models))
## Wrappers
### LLM
There exists a CTransformers LLM wrapper, which you can access with:
```python
from langchain.llms import CTransformers
```
It provides a unified interface for all models:
```python
llm = CTransformers(model='/path/to/ggml-gpt-2.bin', model_type='gpt2')
print(llm('AI is going to'))
```
If you get an `illegal instruction` error, try using `lib='avx'` or `lib='basic'`:
```py
llm = CTransformers(model='/path/to/ggml-gpt-2.bin', model_type='gpt2', lib='avx')
```
It can be used with models hosted on the Hugging Face Hub:
```py
llm = CTransformers(model='marella/gpt-2-ggml')
```
If a model repo has multiple model files (`.bin` files), specify a model file using:
```py
llm = CTransformers(model='marella/gpt-2-ggml', model_file='ggml-model.bin')
```
Additional parameters can be passed using the `config` parameter:
```py
config = {'max_new_tokens': 256, 'repetition_penalty': 1.1}
llm = CTransformers(model='marella/gpt-2-ggml', config=config)
```
See [Documentation](https://github.com/marella/ctransformers#config) for a list of available parameters.
For a more detailed walkthrough of this, see [this notebook](/docs/modules/model_io/models/llms/integrations/ctransformers.html).
@@ -0,0 +1,273 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "707d13a7",
"metadata": {},
"source": [
"# Databricks\n",
"\n",
"This notebook covers how to connect to the [Databricks runtimes](https://docs.databricks.com/runtime/index.html) and [Databricks SQL](https://www.databricks.com/product/databricks-sql) using the SQLDatabase wrapper of LangChain.\n",
"It is broken into 3 parts: installation and setup, connecting to Databricks, and examples."
]
},
{
"cell_type": "markdown",
"id": "0076d072",
"metadata": {},
"source": [
"## Installation and Setup"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "739b489b",
"metadata": {},
"outputs": [],
"source": [
"!pip install databricks-sql-connector"
]
},
{
"cell_type": "markdown",
"id": "73113163",
"metadata": {},
"source": [
"## Connecting to Databricks\n",
"\n",
"You can connect to [Databricks runtimes](https://docs.databricks.com/runtime/index.html) and [Databricks SQL](https://www.databricks.com/product/databricks-sql) using the `SQLDatabase.from_databricks()` method.\n",
"\n",
"### Syntax\n",
"```python\n",
"SQLDatabase.from_databricks(\n",
" catalog: str,\n",
" schema: str,\n",
" host: Optional[str] = None,\n",
" api_token: Optional[str] = None,\n",
" warehouse_id: Optional[str] = None,\n",
" cluster_id: Optional[str] = None,\n",
" engine_args: Optional[dict] = None,\n",
" **kwargs: Any)\n",
"```\n",
"### Required Parameters\n",
"* `catalog`: The catalog name in the Databricks database.\n",
"* `schema`: The schema name in the catalog.\n",
"\n",
"### Optional Parameters\n",
"There following parameters are optional. When executing the method in a Databricks notebook, you don't need to provide them in most of the cases.\n",
"* `host`: The Databricks workspace hostname, excluding 'https://' part. Defaults to 'DATABRICKS_HOST' environment variable or current workspace if in a Databricks notebook.\n",
"* `api_token`: The Databricks personal access token for accessing the Databricks SQL warehouse or the cluster. Defaults to 'DATABRICKS_TOKEN' environment variable or a temporary one is generated if in a Databricks notebook.\n",
"* `warehouse_id`: The warehouse ID in the Databricks SQL.\n",
"* `cluster_id`: The cluster ID in the Databricks Runtime. If running in a Databricks notebook and both 'warehouse_id' and 'cluster_id' are None, it uses the ID of the cluster the notebook is attached to.\n",
"* `engine_args`: The arguments to be used when connecting Databricks.\n",
"* `**kwargs`: Additional keyword arguments for the `SQLDatabase.from_uri` method."
]
},
{
"cell_type": "markdown",
"id": "b11c7e48",
"metadata": {},
"source": [
"## Examples"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "8102bca0",
"metadata": {},
"outputs": [],
"source": [
"# Connecting to Databricks with SQLDatabase wrapper\n",
"from langchain import SQLDatabase\n",
"\n",
"db = SQLDatabase.from_databricks(catalog=\"samples\", schema=\"nyctaxi\")"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "9dd36f58",
"metadata": {},
"outputs": [],
"source": [
"# Creating a OpenAI Chat LLM wrapper\n",
"from langchain.chat_models import ChatOpenAI\n",
"\n",
"llm = ChatOpenAI(temperature=0, model_name=\"gpt-4\")"
]
},
{
"cell_type": "markdown",
"id": "5b5c5f1a",
"metadata": {},
"source": [
"### SQL Chain example\n",
"\n",
"This example demonstrates the use of the [SQL Chain](https://python.langchain.com/en/latest/modules/chains/examples/sqlite.html) for answering a question over a Databricks database."
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "36f2270b",
"metadata": {},
"outputs": [],
"source": [
"from langchain import SQLDatabaseChain\n",
"\n",
"db_chain = SQLDatabaseChain.from_llm(llm, db, verbose=True)"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "4e2b5f25",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new SQLDatabaseChain chain...\u001b[0m\n",
"What is the average duration of taxi rides that start between midnight and 6am?\n",
"SQLQuery:\u001b[32;1m\u001b[1;3mSELECT AVG(UNIX_TIMESTAMP(tpep_dropoff_datetime) - UNIX_TIMESTAMP(tpep_pickup_datetime)) as avg_duration\n",
"FROM trips\n",
"WHERE HOUR(tpep_pickup_datetime) >= 0 AND HOUR(tpep_pickup_datetime) < 6\u001b[0m\n",
"SQLResult: \u001b[33;1m\u001b[1;3m[(987.8122786304605,)]\u001b[0m\n",
"Answer:\u001b[32;1m\u001b[1;3mThe average duration of taxi rides that start between midnight and 6am is 987.81 seconds.\u001b[0m\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"'The average duration of taxi rides that start between midnight and 6am is 987.81 seconds.'"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"db_chain.run(\n",
" \"What is the average duration of taxi rides that start between midnight and 6am?\"\n",
")"
]
},
{
"cell_type": "markdown",
"id": "e496d5e5",
"metadata": {},
"source": [
"### SQL Database Agent example\n",
"\n",
"This example demonstrates the use of the [SQL Database Agent](/docs/modules/agents/toolkits/sql_database.html) for answering questions over a Databricks database."
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "9918e86a",
"metadata": {},
"outputs": [],
"source": [
"from langchain.agents import create_sql_agent\n",
"from langchain.agents.agent_toolkits import SQLDatabaseToolkit\n",
"\n",
"toolkit = SQLDatabaseToolkit(db=db, llm=llm)\n",
"agent = create_sql_agent(llm=llm, toolkit=toolkit, verbose=True)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "c484a76e",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3mAction: list_tables_sql_db\n",
"Action Input: \u001b[0m\n",
"Observation: \u001b[38;5;200m\u001b[1;3mtrips\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3mI should check the schema of the trips table to see if it has the necessary columns for trip distance and duration.\n",
"Action: schema_sql_db\n",
"Action Input: trips\u001b[0m\n",
"Observation: \u001b[33;1m\u001b[1;3m\n",
"CREATE TABLE trips (\n",
"\ttpep_pickup_datetime TIMESTAMP, \n",
"\ttpep_dropoff_datetime TIMESTAMP, \n",
"\ttrip_distance FLOAT, \n",
"\tfare_amount FLOAT, \n",
"\tpickup_zip INT, \n",
"\tdropoff_zip INT\n",
") USING DELTA\n",
"\n",
"/*\n",
"3 rows from trips table:\n",
"tpep_pickup_datetime\ttpep_dropoff_datetime\ttrip_distance\tfare_amount\tpickup_zip\tdropoff_zip\n",
"2016-02-14 16:52:13+00:00\t2016-02-14 17:16:04+00:00\t4.94\t19.0\t10282\t10171\n",
"2016-02-04 18:44:19+00:00\t2016-02-04 18:46:00+00:00\t0.28\t3.5\t10110\t10110\n",
"2016-02-17 17:13:57+00:00\t2016-02-17 17:17:55+00:00\t0.7\t5.0\t10103\t10023\n",
"*/\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3mThe trips table has the necessary columns for trip distance and duration. I will write a query to find the longest trip distance and its duration.\n",
"Action: query_checker_sql_db\n",
"Action Input: SELECT trip_distance, tpep_dropoff_datetime - tpep_pickup_datetime as duration FROM trips ORDER BY trip_distance DESC LIMIT 1\u001b[0m\n",
"Observation: \u001b[31;1m\u001b[1;3mSELECT trip_distance, tpep_dropoff_datetime - tpep_pickup_datetime as duration FROM trips ORDER BY trip_distance DESC LIMIT 1\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3mThe query is correct. I will now execute it to find the longest trip distance and its duration.\n",
"Action: query_sql_db\n",
"Action Input: SELECT trip_distance, tpep_dropoff_datetime - tpep_pickup_datetime as duration FROM trips ORDER BY trip_distance DESC LIMIT 1\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3m[(30.6, '0 00:43:31.000000000')]\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3mI now know the final answer.\n",
"Final Answer: The longest trip distance is 30.6 miles and it took 43 minutes and 31 seconds.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"'The longest trip distance is 30.6 miles and it took 43 minutes and 31 seconds.'"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"agent.run(\"What is the longest trip distance and how long did it take?\")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.3"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
@@ -0,0 +1,42 @@
Databricks
==========
The [Databricks](https://www.databricks.com/) Lakehouse Platform unifies data, analytics, and AI on one platform.
Databricks embraces the LangChain ecosystem in various ways:
1. Databricks connector for the SQLDatabase Chain: SQLDatabase.from_databricks() provides an easy way to query your data on Databricks through LangChain
2. Databricks MLflow integrates with LangChain: Tracking and serving LangChain applications with fewer steps
3. Databricks MLflow AI Gateway
4. Databricks as an LLM provider: Deploy your fine-tuned LLMs on Databricks via serving endpoints or cluster driver proxy apps, and query them as `langchain.llms.Databricks`
5. Databricks Dolly: Databricks open-sourced Dolly which allows for commercial use, and can be accessed through the Hugging Face Hub
Databricks connector for the SQLDatabase Chain
----------------------------------------------
You can connect to [Databricks runtimes](https://docs.databricks.com/runtime/index.html) and [Databricks SQL](https://www.databricks.com/product/databricks-sql) using the SQLDatabase wrapper of LangChain. See the notebook [Connect to Databricks](/docs/ecosystem/integrations/databricks/databricks.html) for details.
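When connecting from outside a Databricks notebook, the host, token, and compute target usually have to be supplied explicitly. The sketch below is one way to assemble those arguments; it assumes the `from_databricks` parameter names and the `DATABRICKS_HOST` / `DATABRICKS_TOKEN` environment variables described in the linked notebook. The helper names, and the check that rejects passing both IDs, are choices made for this sketch rather than documented library behavior.

```python
import os


def normalize_host(host):
    """Return the workspace hostname without a URL scheme or trailing slash."""
    for prefix in ("https://", "http://"):
        if host.startswith(prefix):
            host = host[len(prefix):]
    return host.rstrip("/")


def databricks_kwargs(catalog, schema, warehouse_id=None, cluster_id=None, env=None):
    """Assemble keyword arguments for SQLDatabase.from_databricks()."""
    env = os.environ if env is None else env
    if warehouse_id and cluster_id:
        # Check made by this sketch: target one compute resource at a time.
        raise ValueError("pass either warehouse_id or cluster_id, not both")
    kwargs = {"catalog": catalog, "schema": schema}
    if env.get("DATABRICKS_HOST"):
        kwargs["host"] = normalize_host(env["DATABRICKS_HOST"])
    if env.get("DATABRICKS_TOKEN"):
        kwargs["api_token"] = env["DATABRICKS_TOKEN"]
    if warehouse_id:
        kwargs["warehouse_id"] = warehouse_id
    if cluster_id:
        kwargs["cluster_id"] = cluster_id
    return kwargs


def connect(catalog, schema, **options):
    """Open a SQLDatabase connection using the assembled arguments."""
    # Imported lazily so the helpers above work without langchain installed.
    from langchain import SQLDatabase

    return SQLDatabase.from_databricks(**databricks_kwargs(catalog, schema, **options))
```

For example, `connect("samples", "nyctaxi", warehouse_id="<your-warehouse-id>")` would target a SQL warehouse, with the placeholder replaced by a real ID.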
Databricks MLflow integrates with LangChain
-------------------------------------------
MLflow is an open source platform to manage the ML lifecycle, including experimentation, reproducibility, deployment, and a central model registry. See the notebook [MLflow Callback Handler](/docs/ecosystem/integrations/mlflow_tracking.ipynb) for details about MLflow's integration with LangChain.
Databricks provides a fully managed and hosted version of MLflow integrated with enterprise security features, high availability, and other Databricks workspace features such as experiment and run management and notebook revision capture. MLflow on Databricks offers an integrated experience for tracking and securing machine learning model training runs and running machine learning projects. See [MLflow guide](https://docs.databricks.com/mlflow/index.html) for more details.
Databricks MLflow makes it more convenient to develop LangChain applications on Databricks. For MLflow tracking, you don't need to set the tracking uri. For MLflow Model Serving, you can save LangChain Chains in the MLflow langchain flavor, and then register and serve the Chain with a few clicks on Databricks, with credentials securely managed by MLflow Model Serving.
Databricks MLflow AI Gateway
----------------------------
See [MLflow AI Gateway](/docs/ecosystem/integrations/mlflow_ai_gateway).
Databricks as an LLM provider
-----------------------------
The notebook [Wrap Databricks endpoints as LLMs](/docs/modules/model_io/models/llms/integrations/databricks.html) illustrates the method to wrap Databricks endpoints as LLMs in LangChain. It supports two types of endpoints: the serving endpoint, which is recommended for both production and development, and the cluster driver proxy app, which is recommended for interactive development.
Databricks endpoints support Dolly, but are also great for hosting models like MPT-7B or any other models from the Hugging Face ecosystem. Databricks endpoints can also be used with proprietary models like OpenAI to provide a governance layer for enterprises.
Databricks Dolly
----------------
Databricks Dolly is an instruction-following large language model trained on the Databricks machine learning platform that is licensed for commercial use. The model is available on Hugging Face Hub as databricks/dolly-v2-12b. See the notebook [Hugging Face Hub](/docs/modules/model_io/models/llms/integrations/huggingface_hub.html) for instructions to access it through the Hugging Face Hub integration with LangChain.
@@ -0,0 +1,88 @@
# Datadog Tracing
>[ddtrace](https://github.com/DataDog/dd-trace-py) is a Datadog application performance monitoring (APM) library which provides an integration to monitor your LangChain application.
Key features of the ddtrace integration for LangChain:
- Traces: Capture LangChain requests, parameters, prompt-completions, and help visualize LangChain operations.
- Metrics: Capture LangChain request latency, errors, and token/cost usage (for OpenAI LLMs and Chat Models).
- Logs: Store prompt completion data for each LangChain operation.
- Dashboard: Combine metrics, logs, and trace data into a single plane to monitor LangChain requests.
- Monitors: Provide alerts in response to spikes in LangChain request latency or error rate.
Note: The ddtrace LangChain integration currently provides tracing for LLMs, Chat Models, Text Embedding Models, Chains, and Vectorstores.
## Installation and Setup
1. Enable APM and StatsD in your Datadog Agent, along with a Datadog API key. For example, in Docker:
```
docker run -d --cgroupns host \
--pid host \
-v /var/run/docker.sock:/var/run/docker.sock:ro \
-v /proc/:/host/proc/:ro \
-v /sys/fs/cgroup/:/host/sys/fs/cgroup:ro \
-e DD_API_KEY=<DATADOG_API_KEY> \
-p 127.0.0.1:8126:8126/tcp \
-p 127.0.0.1:8125:8125/udp \
-e DD_DOGSTATSD_NON_LOCAL_TRAFFIC=true \
-e DD_APM_ENABLED=true \
gcr.io/datadoghq/agent:latest
```
2. Install the Datadog APM Python library.
```
pip install "ddtrace>=1.17"
```
3. The LangChain integration can be enabled automatically when you prefix your LangChain Python application command with `ddtrace-run`:
```
DD_SERVICE="my-service" DD_ENV="staging" DD_API_KEY=<DATADOG_API_KEY> ddtrace-run python <your-app>.py
```
**Note**: If the Agent is using a non-default hostname or port, be sure to also set `DD_AGENT_HOST`, `DD_TRACE_AGENT_PORT`, or `DD_DOGSTATSD_PORT`.
Additionally, the LangChain integration can be enabled programmatically by adding `patch_all()` or `patch(langchain=True)` before the first import of `langchain` in your application.
Note that using `ddtrace-run` or `patch_all()` will also enable the `requests` and `aiohttp` integrations which trace HTTP requests to LLM providers, as well as the `openai` integration which traces requests to the OpenAI library.
```python
from ddtrace import config, patch
# Note: be sure to configure the integration before calling ``patch()``!
# eg. config.langchain["logs_enabled"] = True
patch(langchain=True)
# to trace synchronous HTTP requests
# patch(langchain=True, requests=True)
# to trace asynchronous HTTP requests (to the OpenAI library)
# patch(langchain=True, aiohttp=True)
# to include underlying OpenAI spans from the OpenAI integration
# patch(langchain=True, openai=True)
```
See the [APM Python library documentation](https://ddtrace.readthedocs.io/en/stable/installation_quickstart.html) for more advanced usage.
## Configuration
See the [APM Python library documentation](https://ddtrace.readthedocs.io/en/stable/integrations.html#langchain) for all the available configuration options.
### Log Prompt & Completion Sampling
To enable log prompt and completion sampling, set the `DD_LANGCHAIN_LOGS_ENABLED=1` environment variable. By default, 10% of traced requests will emit logs containing the prompts and completions.
To adjust the log sample rate, see the [APM library documentation](https://ddtrace.readthedocs.io/en/stable/integrations.html#langchain).
**Note**: Logs submission requires `DD_API_KEY` to be specified when running `ddtrace-run`.
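If the application is launched from another Python process, the same settings can be prepared in code. The following is a rough sketch, not part of ddtrace: `ddtrace_launch_spec` is a hypothetical helper that builds the `ddtrace-run` command and environment from the variables mentioned above and fails early when log submission is requested without `DD_API_KEY`.

```python
import os


def ddtrace_launch_spec(app_path, service, env_name, logs_enabled=False, base_env=None):
    """Return (command, environment) for running an app under ddtrace-run."""
    env = dict(os.environ if base_env is None else base_env)
    env["DD_SERVICE"] = service
    env["DD_ENV"] = env_name
    if logs_enabled:
        # Log submission needs an API key, so fail early if it is absent.
        if not env.get("DD_API_KEY"):
            raise ValueError("DD_API_KEY must be set to submit logs")
        env["DD_LANGCHAIN_LOGS_ENABLED"] = "1"
    return ["ddtrace-run", "python", app_path], env
```

The returned pair can be passed to `subprocess.run(command, env=env, check=True)` to start the traced application.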
## Troubleshooting
Need help? Create an issue on [ddtrace](https://github.com/DataDog/dd-trace-py) or contact [Datadog support](https://docs.datadoghq.com/help/).
@@ -0,0 +1,19 @@
# Datadog Logs
>[Datadog](https://www.datadoghq.com/) is a monitoring and analytics platform for cloud-scale applications.
## Installation and Setup
```bash
pip install datadog_api_client
```
We must initialize the loader with the Datadog API key and APP key, and we need to set up the query to extract the desired logs.
## Document Loader
See a [usage example](/docs/modules/data_connection/document_loaders/integrations/datadog_logs.html).
```python
from langchain.document_loaders import DatadogLogsLoader
```
@@ -0,0 +1,51 @@
# DataForSEO
This page provides instructions on how to use the DataForSEO search APIs within LangChain.
## Installation and Setup
- Get a DataForSEO API Access login and password, and set them as environment variables (`DATAFORSEO_LOGIN` and `DATAFORSEO_PASSWORD` respectively). You can find them in your dashboard.
## Wrappers
### Utility
The DataForSEO utility wraps the API. To import this utility, use:
```python
from langchain.utilities import DataForSeoAPIWrapper
```
For a detailed walkthrough of this wrapper, see [this notebook](/docs/modules/agents/tools/integrations/dataforseo.ipynb).
### Tool
You can also load this wrapper as a Tool to use with an Agent:
```python
from langchain.agents import load_tools
tools = load_tools(["dataforseo-api-search"])
```
## Example usage
```python
dataforseo = DataForSeoAPIWrapper(api_login="your_login", api_password="your_password")
result = dataforseo.run("Bill Gates")
print(result)
```
## Environment Variables
You can store your DataForSEO API Access login and password as environment variables. The wrapper will automatically check for these environment variables if no values are provided:
```python
import os
os.environ["DATAFORSEO_LOGIN"] = "your_login"
os.environ["DATAFORSEO_PASSWORD"] = "your_password"
dataforseo = DataForSeoAPIWrapper()
result = dataforseo.run("weather in Los Angeles")
print(result)
```
@@ -0,0 +1,25 @@
# DeepInfra
This page covers how to use the DeepInfra ecosystem within LangChain.
It is broken into two parts: installation and setup, and then references to specific DeepInfra wrappers.
## Installation and Setup
- Get your DeepInfra API key [here](https://deepinfra.com/).
- Set it as an environment variable (`DEEPINFRA_API_TOKEN`)
## Available Models
DeepInfra provides a range of Open Source LLMs ready for deployment.
You can list supported models [here](https://deepinfra.com/models?type=text-generation).
google/flan\* models can be viewed [here](https://deepinfra.com/models?type=text2text-generation).
You can view a list of request and response parameters [here](https://deepinfra.com/databricks/dolly-v2-12b#API)
## Wrappers
### LLM
There exists a DeepInfra LLM wrapper, which you can access with
```python
from langchain.llms import DeepInfra
```
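A minimal sketch of putting the wrapper to use is shown below. The model ID is taken from the API link above; the generation settings in `DEFAULT_MODEL_KWARGS` and the helper names are illustrative assumptions, so check the model's API page for the parameters it actually accepts.

```python
import os

# Illustrative generation settings; check the model's API page for the
# parameters it actually accepts.
DEFAULT_MODEL_KWARGS = {"temperature": 0.7, "max_new_tokens": 250}


def merged_model_kwargs(**overrides):
    """Combine the illustrative defaults with caller-supplied overrides."""
    return {**DEFAULT_MODEL_KWARGS, **overrides}


def build_llm(model_id="databricks/dolly-v2-12b", **overrides):
    """Create a DeepInfra LLM once DEEPINFRA_API_TOKEN is available."""
    if not os.environ.get("DEEPINFRA_API_TOKEN"):
        raise RuntimeError("set DEEPINFRA_API_TOKEN before creating the LLM")
    # Imported lazily so the helper above works without langchain installed.
    from langchain.llms import DeepInfra

    return DeepInfra(model_id=model_id, model_kwargs=merged_model_kwargs(**overrides))
```

With the token set, `build_llm(temperature=0.2)("Tell me a fact")` would send a prompt using the merged settings.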
@@ -0,0 +1,30 @@
# Deep Lake
This page covers how to use the Deep Lake ecosystem within LangChain.
## Why Deep Lake?
- More than just a (multi-modal) vector store. You can later use the dataset to fine-tune your own LLM models.
- Not only stores embeddings, but also the original data with automatic version control.
- Truly serverless. Doesn't require another service and can be used with major cloud providers (AWS S3, GCS, etc.)
## More Resources
1. [Ultimate Guide to LangChain & Deep Lake: Build ChatGPT to Answer Questions on Your Financial Data](https://www.activeloop.ai/resources/ultimate-guide-to-lang-chain-deep-lake-build-chat-gpt-to-answer-questions-on-your-financial-data/)
2. [Twitter the-algorithm codebase analysis with Deep Lake](../use_cases/code/twitter-the-algorithm-analysis-deeplake.html)
3. Here are the [whitepaper](https://www.deeplake.ai/whitepaper) and [academic paper](https://arxiv.org/pdf/2209.10785.pdf) for Deep Lake
4. Here is a set of additional resources available for review: [Deep Lake](https://github.com/activeloopai/deeplake), [Get started](https://docs.activeloop.ai/getting-started) and [Tutorials](https://docs.activeloop.ai/hub-tutorials)
## Installation and Setup
- Install the Python package with `pip install deeplake`
## Wrappers
### VectorStore
There exists a wrapper around Deep Lake, a data lake for Deep Learning applications, allowing you to use it as a vector store (for now), whether for semantic search or example selection.
To import this vectorstore:
```python
from langchain.vectorstores import DeepLake
```
For a more detailed walkthrough of the Deep Lake wrapper, see [this notebook](/docs/modules/data_connection/vectorstores/integrations/deeplake.html)
@@ -0,0 +1,18 @@
# Diffbot
>[Diffbot](https://docs.diffbot.com/docs) is a service to read web pages. Unlike traditional web scraping tools,
> `Diffbot` doesn't require any rules to read the content on a page.
>It starts with computer vision, which classifies a page into one of 20 possible types. Content is then interpreted by a machine learning model trained to identify the key attributes on a page based on its type.
>The result is a website transformed into clean-structured data (like JSON or CSV), ready for your application.
## Installation and Setup
Read the [instructions](https://docs.diffbot.com/reference/authentication) on how to get a Diffbot API token.
## Document Loader
See a [usage example](/docs/modules/data_connection/document_loaders/integrations/diffbot.html).
```python
from langchain.document_loaders import DiffbotLoader
```
@@ -0,0 +1,30 @@
# Discord
>[Discord](https://discord.com/) is a VoIP and instant messaging social platform. Users have the ability to communicate
> with voice calls, video calls, text messaging, media and files in private chats or as part of communities called
> "servers". A server is a collection of persistent chat rooms and voice channels which can be accessed via invite links.
## Installation and Setup
```bash
pip install pandas
```
Follow these steps to download your `Discord` data:
1. Go to your **User Settings**
2. Then go to **Privacy and Safety**
3. Go to **Request all of my Data** and click the **Request Data** button
It may take up to 30 days to receive your data. You'll receive an email at the address registered
with Discord. That email contains a download button for your personal Discord data.
## Document Loader
See a [usage example](/docs/modules/data_connection/document_loaders/integrations/discord.html).
```python
from langchain.document_loaders import DiscordChatLoader
```
@@ -0,0 +1,20 @@
# Docugami
>[Docugami](https://docugami.com) converts business documents into a Document XML Knowledge Graph, generating forests
> of XML semantic trees representing entire documents. This is a rich representation that includes the semantic and
> structural characteristics of various chunks in the document as an XML tree.
## Installation and Setup
```bash
pip install lxml
```
## Document Loader
See a [usage example](/docs/modules/data_connection/document_loaders/integrations/docugami.html).
```python
from langchain.document_loaders import DocugamiLoader
```
@@ -0,0 +1,19 @@
# DuckDB
>[DuckDB](https://duckdb.org/) is an in-process SQL OLAP database management system.
## Installation and Setup
First, you need to install the `duckdb` Python package.
```bash
pip install duckdb
```
## Document Loader
See a [usage example](/docs/modules/data_connection/document_loaders/integrations/duckdb.html).
```python
from langchain.document_loaders import DuckDBLoader
```
@@ -0,0 +1,24 @@
# Elasticsearch
>[Elasticsearch](https://www.elastic.co/elasticsearch/) is a distributed, RESTful search and analytics engine.
> It provides a distributed, multi-tenant-capable full-text search engine with an HTTP web interface and schema-free
> JSON documents.
## Installation and Setup
```bash
pip install elasticsearch
```
## Retriever
>In information retrieval, [Okapi BM25](https://en.wikipedia.org/wiki/Okapi_BM25) (BM is an abbreviation of best matching) is a ranking function used by search engines to estimate the relevance of documents to a given search query. It is based on the probabilistic retrieval framework developed in the 1970s and 1980s by Stephen E. Robertson, Karen Spärck Jones, and others.
>The name of the actual ranking function is BM25. The fuller name, Okapi BM25, includes the name of the first system to use it, which was the Okapi information retrieval system, implemented at London's City University in the 1980s and 1990s. BM25 and its newer variants, e.g. BM25F (a version of BM25 that can take document structure and anchor text into account), represent TF-IDF-like retrieval functions used in document retrieval.
See a [usage example](/docs/modules/data_connection/retrievers/integrations/elastic_search_bm25.html).
```python
from langchain.retrievers import ElasticSearchBM25Retriever
```
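To make the BM25 description above concrete, here is a minimal sketch of the scoring function in plain Python. It is illustrative only and is not how `ElasticSearchBM25Retriever` or Elasticsearch compute scores internally; the helper name `bm25_scores`, the defaults `k1=1.5` and `b=0.75`, and the toy corpus are all assumptions.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document in `docs` against the tokenized `query`.

    Hypothetical helper for illustration; k1 and b are commonly used defaults.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for d in docs for term in set(d))

    scores = []
    for doc in docs:
        term_freq = Counter(doc)
        score = 0.0
        for term in query:
            n_t = doc_freq.get(term, 0)
            # Smoothed inverse document frequency (never negative).
            idf = math.log((n_docs - n_t + 0.5) / (n_t + 0.5) + 1.0)
            f = term_freq.get(term, 0)
            # Term-frequency saturation with document-length normalization.
            norm = f + k1 * (1.0 - b + b * len(doc) / avg_len)
            score += idf * f * (k1 + 1.0) / norm
        scores.append(score)
    return scores


# Toy corpus (made up): three already-tokenized documents.
corpus = [["cat", "sat"], ["dog", "barked"], ["cat", "cat", "purred"]]
scores = bm25_scores(["cat"], corpus)
```

A document that never mentions a query term scores zero, and repeated mentions raise the score with diminishing returns, which is the TF-IDF-like behavior described above.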
@@ -0,0 +1,20 @@
# EverNote
>[EverNote](https://evernote.com/) is intended for archiving and creating notes in which photos, audio and saved web content can be embedded. Notes are stored in virtual "notebooks" and can be tagged, annotated, edited, searched, and exported.
## Installation and Setup
First, you need to install the `lxml` and `html2text` Python packages.
```bash
pip install lxml
pip install html2text
```
## Document Loader
See a [usage example](/docs/modules/data_connection/document_loaders/integrations/evernote.html).
```python
from langchain.document_loaders import EverNoteLoader
```
@@ -0,0 +1,21 @@
# Facebook Chat
>[Messenger](https://en.wikipedia.org/wiki/Messenger_(software)) is an American proprietary instant messaging app and
> platform developed by `Meta Platforms`. Originally developed as `Facebook Chat` in 2008, the company revamped its
> messaging service in 2010.
## Installation and Setup
First, you need to install the `pandas` Python package.
```bash
pip install pandas
```
## Document Loader
See a [usage example](/docs/modules/data_connection/document_loaders/integrations/facebook_chat.html).
```python
from langchain.document_loaders import FacebookChatLoader
```
@@ -0,0 +1,21 @@
# Figma
>[Figma](https://www.figma.com/) is a collaborative web application for interface design.
## Installation and Setup
The Figma API requires an `access token`, `node_ids`, and a `file key`.
The `file key` can be pulled from the URL. https://www.figma.com/file/{filekey}/sampleFilename
`Node IDs` are also available in the URL. Click on anything and look for the '?node-id={node_id}' param.
`Access token` [instructions](https://help.figma.com/hc/en-us/articles/8085703771159-Manage-personal-access-tokens).
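As a rough sketch of the URL conventions described above, the file key and node ID can be pulled out with the standard library. The helper name `parse_figma_url` and the sample URL values are hypothetical; only the `/file/{filekey}/...` path shape and the `node-id` query parameter come from the text above.

```python
from urllib.parse import parse_qs, urlparse


def parse_figma_url(url):
    """Return (file_key, node_id) from a Figma file URL; node_id may be None."""
    parsed = urlparse(url)
    # The path is expected to look like /file/{filekey}/sampleFilename
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) < 2 or parts[0] != "file":
        raise ValueError(f"Not a Figma file URL: {url}")
    file_key = parts[1]
    # parse_qs also decodes percent-escapes such as %3A (":").
    node_ids = parse_qs(parsed.query).get("node-id")
    return file_key, (node_ids[0] if node_ids else None)


# Made-up example URL following the pattern shown above.
file_key, node_id = parse_figma_url(
    "https://www.figma.com/file/abc123/sampleFilename?node-id=0%3A1"
)
```

The two values can then be supplied, together with your access token, when configuring the loader.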
## Document Loader
See a [usage example](/docs/modules/data_connection/document_loaders/integrations/figma.html).
```python
from langchain.document_loaders import FigmaFileLoader
```
@@ -0,0 +1,153 @@
# Flyte
> [Flyte](https://github.com/flyteorg/flyte) is an open-source orchestrator that facilitates building production-grade data and ML pipelines.
> It is built for scalability and reproducibility, leveraging Kubernetes as its underlying platform.
The purpose of this notebook is to demonstrate the integration of a `FlyteCallback` into your Flyte task, enabling you to effectively monitor and track your LangChain experiments.
## Installation & Setup
- Install the Flytekit library by running the command `pip install flytekit`.
- Install the Flytekit-Envd plugin by running the command `pip install flytekitplugins-envd`.
- Install LangChain by running the command `pip install langchain`.
- Install [Docker](https://docs.docker.com/engine/install/) on your system.
## Flyte Tasks
A Flyte [task](https://docs.flyte.org/projects/cookbook/en/latest/auto/core/flyte_basics/task.html) serves as the foundational building block of Flyte.
To execute LangChain experiments, you need to write Flyte tasks that define the specific steps and operations involved.
NOTE: The [getting started guide](https://docs.flyte.org/projects/cookbook/en/latest/index.html) offers detailed, step-by-step instructions on installing Flyte locally and running your initial Flyte pipeline.
First, import the necessary dependencies to support your LangChain experiments.
```python
import os
from flytekit import ImageSpec, task
from langchain.agents import AgentType, initialize_agent, load_tools
from langchain.callbacks import FlyteCallbackHandler
from langchain.chains import LLMChain
from langchain.chat_models import ChatOpenAI
from langchain.prompts import PromptTemplate
from langchain.schema import HumanMessage
```
Set up the necessary environment variables to utilize the OpenAI API and Serp API:
```python
# Set OpenAI API key
os.environ["OPENAI_API_KEY"] = "<your_openai_api_key>"
# Set Serp API key
os.environ["SERPAPI_API_KEY"] = "<your_serp_api_key>"
```
Replace `<your_openai_api_key>` and `<your_serp_api_key>` with your respective API keys obtained from OpenAI and Serp API.
To guarantee reproducibility of your pipelines, Flyte tasks are containerized.
Each Flyte task must be associated with an image, which can either be shared across the entire Flyte [workflow](https://docs.flyte.org/projects/cookbook/en/latest/auto/core/flyte_basics/basic_workflow.html) or provided separately for each task.
To streamline the process of supplying the required dependencies for each Flyte task, you can initialize an [`ImageSpec`](https://docs.flyte.org/projects/cookbook/en/latest/auto/core/image_spec/image_spec.html) object.
This approach automatically triggers a Docker build, alleviating the need for users to manually create a Docker image.
```python
custom_image = ImageSpec(
    name="langchain-flyte",
    packages=[
        "langchain",
        "openai",
        "spacy",
        "https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.5.0/en_core_web_sm-3.5.0.tar.gz",
        "textstat",
        "google-search-results",
    ],
    registry="<your-registry>",
)
```
You have the flexibility to push the Docker image to a registry of your preference.
[Docker Hub](https://hub.docker.com/) or [GitHub Container Registry (GHCR)](https://docs.github.com/en/packages/working-with-a-github-packages-registry/working-with-the-container-registry) is a convenient option to begin with.
Once you have selected a registry, you can proceed to create Flyte tasks that log the LangChain metrics to Flyte Deck.
The following examples demonstrate tasks for an OpenAI LLM, a chain, and an agent with tools:
### LLM
```python
@task(disable_deck=False, container_image=custom_image)
def langchain_llm() -> str:
    llm = ChatOpenAI(
        model_name="gpt-3.5-turbo",
        temperature=0.2,
        callbacks=[FlyteCallbackHandler()],
    )
    return llm([HumanMessage(content="Tell me a joke")]).content
```
### Chain
```python
@task(disable_deck=False, container_image=custom_image)
def langchain_chain() -> list[dict[str, str]]:
    template = """You are a playwright. Given the title of play, it is your job to write a synopsis for that title.
Title: {title}
Playwright: This is a synopsis for the above play:"""
    llm = ChatOpenAI(
        model_name="gpt-3.5-turbo",
        temperature=0,
        callbacks=[FlyteCallbackHandler()],
    )
    prompt_template = PromptTemplate(input_variables=["title"], template=template)
    synopsis_chain = LLMChain(
        llm=llm, prompt=prompt_template, callbacks=[FlyteCallbackHandler()]
    )
    test_prompts = [
        {
            "title": "documentary about good video games that push the boundary of game design"
        },
    ]
    return synopsis_chain.apply(test_prompts)
```
### Agent
```python
@task(disable_deck=False, container_image=custom_image)
def langchain_agent() -> str:
    # ChatOpenAI is the class imported above; gpt-3.5-turbo is a chat model.
    llm = ChatOpenAI(
        model_name="gpt-3.5-turbo",
        temperature=0,
        callbacks=[FlyteCallbackHandler()],
    )
    tools = load_tools(
        ["serpapi", "llm-math"], llm=llm, callbacks=[FlyteCallbackHandler()]
    )
    agent = initialize_agent(
        tools,
        llm,
        agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
        callbacks=[FlyteCallbackHandler()],
        verbose=True,
    )
    return agent.run(
        "Who is Leonardo DiCaprio's girlfriend? Could you calculate her current age and raise it to the power of 0.43?"
    )
```
These tasks serve as a starting point for running your LangChain experiments within Flyte.
## Execute the Flyte Tasks on Kubernetes
To execute the Flyte tasks on the configured Flyte backend, use the following command:
```bash
pyflyte run --image <your-image> langchain_flyte.py langchain_llm
```
This command will initiate the execution of the `langchain_llm` task on the Flyte backend. You can trigger the remaining two tasks in a similar manner.
The metrics will be displayed on the Flyte UI as follows:
![LangChain LLM](https://ik.imagekit.io/c8zl7irwkdda/Screenshot_2023-06-20_at_1.23.29_PM_MZYeG0dKa.png?updatedAt=1687247642993)
@@ -0,0 +1,16 @@
# ForefrontAI
This page covers how to use the ForefrontAI ecosystem within LangChain.
It is broken into two parts: installation and setup, and then references to specific ForefrontAI wrappers.
## Installation and Setup
- Get a ForefrontAI API key and set it as an environment variable (`FOREFRONTAI_API_KEY`)
## Wrappers
### LLM
There exists a ForefrontAI LLM wrapper, which you can access with
```python
from langchain.llms import ForefrontAI
```
@@ -0,0 +1,19 @@
# Git
>[Git](https://en.wikipedia.org/wiki/Git) is a distributed version control system that tracks changes in any set of computer files, usually used for coordinating work among programmers collaboratively developing source code during software development.
## Installation and Setup
First, you need to install the `GitPython` Python package.
```bash
pip install GitPython
```
## Document Loader
See a [usage example](/docs/modules/data_connection/document_loaders/integrations/git.html).
```python
from langchain.document_loaders import GitLoader
```
@@ -0,0 +1,15 @@
# GitBook
>[GitBook](https://docs.gitbook.com/) is a modern documentation platform where teams can document everything from products to internal knowledge bases and APIs.
## Installation and Setup
There isn't any special setup for it.
## Document Loader
See a [usage example](/docs/modules/data_connection/document_loaders/integrations/gitbook.html).
```python
from langchain.document_loaders import GitbookLoader
```
@@ -0,0 +1,34 @@
# Golden
>[Golden](https://golden.com) provides a set of natural language APIs for querying and enrichment using the Golden Knowledge Graph e.g. queries such as: `Products from OpenAI`, `Generative ai companies with series a funding`, and `rappers who invest` can be used to retrieve relevant structured data about relevant entities.
>
>The `golden-query` langchain tool is a wrapper on top of the [Golden Query API](https://docs.golden.com/reference/query-api) which enables programmatic access to these results.
>See the [Golden Query API docs](https://docs.golden.com/reference/query-api) for more information.
## Installation and Setup
- Go to the [Golden API docs](https://docs.golden.com/) to get an overview about the Golden API.
- Get your API key from the [Golden API Settings](https://golden.com/settings/api) page.
- Save your API key in the `GOLDEN_API_KEY` environment variable
## Wrappers
### Utility
There exists a GoldenQueryAPIWrapper utility which wraps this API. To import this utility:
```python
from langchain.utilities.golden_query import GoldenQueryAPIWrapper
```
For a more detailed walkthrough of this wrapper, see [this notebook](/docs/modules/agents/tools/integrations/golden_query.html).
### Tool
You can also easily load this wrapper as a Tool (to use with an Agent).
You can do this with:
```python
from langchain.agents import load_tools
tools = load_tools(["golden-query"])
```
For more information on tools, see [this page](/docs/modules/agents/tools/).
@@ -0,0 +1,20 @@
# Google BigQuery
>[Google BigQuery](https://cloud.google.com/bigquery) is a serverless and cost-effective enterprise data warehouse that works across clouds and scales with your data.
`BigQuery` is a part of the `Google Cloud Platform`.
## Installation and Setup
First, you need to install the `google-cloud-bigquery` Python package.
```bash
pip install google-cloud-bigquery
```
## Document Loader
See a [usage example](/docs/modules/data_connection/document_loaders/integrations/google_bigquery.html).
```python
from langchain.document_loaders import BigQueryLoader
```
@@ -0,0 +1,26 @@
# Google Cloud Storage
>[Google Cloud Storage](https://en.wikipedia.org/wiki/Google_Cloud_Storage) is a managed service for storing unstructured data.
## Installation and Setup
First, you need to install the `google-cloud-storage` Python package.
```bash
pip install google-cloud-storage
```
## Document Loader
There are two loaders for the `Google Cloud Storage`: the `Directory` and the `File` loaders.
See a [usage example](/docs/modules/data_connection/document_loaders/integrations/google_cloud_storage_directory.html).
```python
from langchain.document_loaders import GCSDirectoryLoader
```
See a [usage example](/docs/modules/data_connection/document_loaders/integrations/google_cloud_storage_file.html).
```python
from langchain.document_loaders import GCSFileLoader
```
@@ -0,0 +1,22 @@
# Google Drive
>[Google Drive](https://en.wikipedia.org/wiki/Google_Drive) is a file storage and synchronization service developed by Google.
Currently, only `Google Docs` are supported.
## Installation and Setup
First, you need to install several Python packages.
```bash
pip install google-api-python-client google-auth-httplib2 google-auth-oauthlib
```
## Document Loader
See a [usage example and authorizing instructions](/docs/modules/data_connection/document_loaders/integrations/google_drive.html).
```python
from langchain.document_loaders import GoogleDriveLoader
```
@@ -0,0 +1,32 @@
# Google Search
This page covers how to use the Google Search API within LangChain.
It is broken into two parts: installation and setup, and then references to the specific Google Search wrapper.
## Installation and Setup
- Install requirements with `pip install google-api-python-client`
- Set up a Custom Search Engine, following [these instructions](https://stackoverflow.com/questions/37083058/programmatically-searching-google-in-python-using-custom-search)
- Get an API Key and Custom Search Engine ID from the previous step, and set them as environment variables `GOOGLE_API_KEY` and `GOOGLE_CSE_ID` respectively
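Before constructing the wrapper, it can help to confirm that both variables are actually set. The check below is a minimal sketch using only the standard library; `missing_env_vars` is a hypothetical helper name, not part of LangChain.

```python
import os


def missing_env_vars(names, env=None):
    """Return the names from `names` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


required = ["GOOGLE_API_KEY", "GOOGLE_CSE_ID"]
missing = missing_env_vars(required)
if missing:
    print("Set these environment variables first:", ", ".join(missing))
```

The same pattern applies to any integration on these pages that reads its credentials from the environment.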
## Wrappers
### Utility
There exists a GoogleSearchAPIWrapper utility which wraps this API. To import this utility:
```python
from langchain.utilities import GoogleSearchAPIWrapper
```
For a more detailed walkthrough of this wrapper, see [this notebook](/docs/modules/agents/tools/integrations/google_search.html).
### Tool
You can also easily load this wrapper as a Tool (to use with an Agent).
You can do this with:
```python
from langchain.agents import load_tools
tools = load_tools(["google-search"])
```
For more information on tools, see [this page](/docs/modules/agents/tools/).
@@ -0,0 +1,73 @@
# Google Serper
This page covers how to use the [Serper](https://serper.dev) Google Search API within LangChain. Serper is a low-cost Google Search API that can be used to add answer box, knowledge graph, and organic results data from Google Search.
It is broken into two parts: setup, and then references to the specific Google Serper wrapper.
## Setup
- Go to [serper.dev](https://serper.dev) to sign up for a free account
- Get the api key and set it as an environment variable (`SERPER_API_KEY`)
## Wrappers
### Utility
There exists a GoogleSerperAPIWrapper utility which wraps this API. To import this utility:
```python
from langchain.utilities import GoogleSerperAPIWrapper
```
You can use it as part of a Self Ask chain:
```python
from langchain.utilities import GoogleSerperAPIWrapper
from langchain.llms.openai import OpenAI
from langchain.agents import initialize_agent, Tool
from langchain.agents import AgentType
import os
os.environ["SERPER_API_KEY"] = ""
os.environ['OPENAI_API_KEY'] = ""
llm = OpenAI(temperature=0)
search = GoogleSerperAPIWrapper()
tools = [
    Tool(
        name="Intermediate Answer",
        func=search.run,
        description="useful for when you need to ask with search"
    )
]
self_ask_with_search = initialize_agent(tools, llm, agent=AgentType.SELF_ASK_WITH_SEARCH, verbose=True)
self_ask_with_search.run("What is the hometown of the reigning men's U.S. Open champion?")
```
#### Output
```
Entering new AgentExecutor chain...
Yes.
Follow up: Who is the reigning men's U.S. Open champion?
Intermediate answer: Current champions Carlos Alcaraz, 2022 men's singles champion.
Follow up: Where is Carlos Alcaraz from?
Intermediate answer: El Palmar, Spain
So the final answer is: El Palmar, Spain
> Finished chain.
'El Palmar, Spain'
```
For a more detailed walkthrough of this wrapper, see [this notebook](/docs/modules/agents/tools/integrations/google_serper.html).
### Tool
You can also easily load this wrapper as a Tool (to use with an Agent).
You can do this with:
```python
from langchain.agents import load_tools
tools = load_tools(["google-serper"])
```
For more information on tools, see [this page](/docs/modules/agents/tools/).
@@ -0,0 +1,23 @@
# GooseAI
This page covers how to use the GooseAI ecosystem within LangChain.
It is broken into two parts: installation and setup, and then references to specific GooseAI wrappers.
## Installation and Setup
- Install the Python SDK with `pip install openai`
- Get your GooseAI API key [here](https://goose.ai/).
- Set the environment variable (`GOOSEAI_API_KEY`).
```python
import os
os.environ["GOOSEAI_API_KEY"] = "YOUR_API_KEY"
```
## Wrappers
### LLM
There exists a GooseAI LLM wrapper, which you can access with:
```python
from langchain.llms import GooseAI
```
@@ -0,0 +1,48 @@
# GPT4All
This page covers how to use the `GPT4All` wrapper within LangChain. The tutorial is divided into two parts: installation and setup, followed by usage with an example.
## Installation and Setup
- Install the Python package with `pip install pyllamacpp`
- Download a [GPT4All model](https://github.com/nomic-ai/pyllamacpp#supported-model) and place it in your desired directory
## Usage
### GPT4All
To use the GPT4All wrapper, you need to provide the path to the pre-trained model file and the model's configuration.
```python
from langchain.llms import GPT4All
# Instantiate the model. Callbacks support token-wise streaming
model = GPT4All(model="./models/gpt4all-model.bin", n_ctx=512, n_threads=8)
# Generate text
response = model("Once upon a time, ")
```
You can also customize the generation parameters, such as n_predict, temp, top_p, top_k, and others.
To stream the model's predictions, pass a list of callback handlers.
```python
from langchain.llms import GPT4All
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
# There are many CallbackHandlers supported, such as
# from langchain.callbacks.streamlit import StreamlitCallbackHandler
callbacks = [StreamingStdOutCallbackHandler()]
model = GPT4All(model="./models/gpt4all-model.bin", n_ctx=512, n_threads=8)
# Generate text. Tokens are streamed through the callback manager.
model("Once upon a time, ", callbacks=callbacks)
```
## Model File
You can find links to model file downloads in the [pyllamacpp](https://github.com/nomic-ai/pyllamacpp) repository.
For a more detailed walkthrough of this, see [this notebook](/docs/modules/model_io/models/llms/integrations/gpt4all.html)
@@ -0,0 +1,44 @@
# Graphsignal
This page covers how to use [Graphsignal](https://app.graphsignal.com) to trace and monitor LangChain. Graphsignal enables full visibility into your application. It provides latency breakdowns by chains and tools, exceptions with full context, data monitoring, compute/GPU utilization, OpenAI cost analytics, and more.
## Installation and Setup
- Install the Python library with `pip install graphsignal`
- Create a free Graphsignal account [here](https://graphsignal.com)
- Get an API key and set it as an environment variable (`GRAPHSIGNAL_API_KEY`)
## Tracing and Monitoring
Graphsignal automatically instruments and starts tracing and monitoring chains. Traces and metrics are then available in your [Graphsignal dashboards](https://app.graphsignal.com).
Initialize the tracer by providing a deployment name:
```python
import graphsignal
graphsignal.configure(deployment='my-langchain-app-prod')
```
To additionally trace any function or code, you can use a decorator or a context manager:
```python
@graphsignal.trace_function
def handle_request():
    chain.run("some initial text")
```
```python
with graphsignal.start_trace('my-chain'):
    chain.run("some initial text")
```
Optionally, enable profiling to record function-level statistics for each trace.
```python
with graphsignal.start_trace(
        'my-chain', options=graphsignal.TraceOptions(enable_profiling=True)):
    chain.run("some initial text")
```
See the [Quick Start](https://graphsignal.com/docs/guides/quick-start/) guide for complete setup instructions.
@@ -0,0 +1,44 @@
# Grobid
This page covers how to use Grobid to parse articles for LangChain.
It is separated into two parts: installation and running the server.
## Installation and Setup
#Ensure You have Java installed
!apt-get install -y openjdk-11-jdk -q
!update-alternatives --set java /usr/lib/jvm/java-11-openjdk-amd64/bin/java
#Clone and install the Grobid Repo
import os
!git clone https://github.com/kermitt2/grobid.git
os.environ["JAVA_HOME"] = "/usr/lib/jvm/java-11-openjdk-amd64"
os.chdir('grobid')
!./gradlew clean install
#Run the server,
get_ipython().system_raw('nohup ./gradlew run > grobid.log 2>&1 &')
You can now use the GrobidParser to produce documents
```python
from langchain.document_loaders.parsers import GrobidParser
from langchain.document_loaders.generic import GenericLoader
# Produce chunks from article paragraphs
loader = GenericLoader.from_filesystem(
    "/Users/31treehaus/Desktop/Papers/",
    glob="*",
    suffixes=[".pdf"],
    parser=GrobidParser(segment_sentences=False),
)
docs = loader.load()
# Produce chunks from article sentences
loader = GenericLoader.from_filesystem(
    "/Users/31treehaus/Desktop/Papers/",
    glob="*",
    suffixes=[".pdf"],
    parser=GrobidParser(segment_sentences=True),
)
docs = loader.load()
```
Chunk metadata includes bounding boxes (bboxes), although these are a bit awkward to parse; see https://grobid.readthedocs.io/en/latest/Coordinates-in-PDF/
@@ -0,0 +1,15 @@
# Gutenberg
>[Project Gutenberg](https://www.gutenberg.org/about/) is an online library of free eBooks.
## Installation and Setup
There isn't any special setup for it.
## Document Loader
See a [usage example](/docs/modules/data_connection/document_loaders/integrations/gutenberg.html).
```python
from langchain.document_loaders import GutenbergLoader
```
@@ -0,0 +1,18 @@
# Hacker News
>[Hacker News](https://en.wikipedia.org/wiki/Hacker_News) (sometimes abbreviated as `HN`) is a social news
> website focusing on computer science and entrepreneurship. It is run by the investment fund and startup
> incubator `Y Combinator`. In general, content that can be submitted is defined as "anything that gratifies
> one's intellectual curiosity."
## Installation and Setup
There isn't any special setup for it.
## Document Loader
See a [usage example](/docs/modules/data_connection/document_loaders/integrations/hacker_news.html).
```python
from langchain.document_loaders import HNLoader
```
@@ -0,0 +1,19 @@
# Hazy Research
This page covers how to use the Hazy Research ecosystem within LangChain.
It is broken into two parts: installation and setup, and then references to specific Hazy Research wrappers.
## Installation and Setup
- To use the `manifest`, install it with `pip install manifest-ml`
## Wrappers
### LLM
There exists an LLM wrapper around Hazy Research's `manifest` library.
`manifest` is a python library which is itself a wrapper around many model providers, and adds in caching, history, and more.
To use this wrapper:
```python
from langchain.llms.manifest import ManifestWrapper
```
@@ -0,0 +1,53 @@
# Helicone
This page covers how to use the [Helicone](https://helicone.ai) ecosystem within LangChain.
## What is Helicone?
Helicone is an [open source](https://github.com/Helicone/helicone) observability platform that proxies your OpenAI traffic and provides you key insights into your spend, latency and usage.
![Helicone](/img/HeliconeDashboard.png)
## Quick start
With your LangChain environment you can just add the following parameter.
```bash
export OPENAI_API_BASE="https://oai.hconeai.com/v1"
```
Now head over to [helicone.ai](https://helicone.ai/onboarding?step=2) to create your account, and add your OpenAI API key within our dashboard to view your logs.
![Helicone](/img/HeliconeKeys.png)
## How to enable Helicone caching
```python
from langchain.llms import OpenAI
import openai
openai.api_base = "https://oai.hconeai.com/v1"
llm = OpenAI(temperature=0.9, headers={"Helicone-Cache-Enabled": "true"})
text = "What is a helicone?"
print(llm(text))
```
[Helicone caching docs](https://docs.helicone.ai/advanced-usage/caching)
## How to use Helicone custom properties
```python
from langchain.llms import OpenAI
import openai
openai.api_base = "https://oai.hconeai.com/v1"
llm = OpenAI(temperature=0.9, headers={
    "Helicone-Property-Session": "24",
    "Helicone-Property-Conversation": "support_issue_2",
    "Helicone-Property-App": "mobile",
})
text = "What is a helicone?"
print(llm(text))
```
[Helicone property docs](https://docs.helicone.ai/advanced-usage/custom-properties)
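The header names in the two examples above follow a simple pattern, so they can be assembled programmatically. The sketch below assumes only the `Helicone-Cache-Enabled` and `Helicone-Property-<Name>` headers shown on this page; `helicone_headers` is a hypothetical helper, not part of Helicone or LangChain.

```python
def helicone_headers(cache=False, properties=None):
    """Build a dict of Helicone request headers (hypothetical helper)."""
    headers = {}
    if cache:
        headers["Helicone-Cache-Enabled"] = "true"
    for name, value in (properties or {}).items():
        # Custom properties use the Helicone-Property-<Name> header pattern.
        headers[f"Helicone-Property-{name}"] = str(value)
    return headers


headers = helicone_headers(
    cache=True,
    properties={"Session": 24, "Conversation": "support_issue_2", "App": "mobile"},
)
```

The resulting dictionary can be passed as the `headers` argument in the same way as the literal dictionaries in the examples above.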
@@ -0,0 +1,23 @@
# Hologres
>[Hologres](https://www.alibabacloud.com/help/en/hologres/latest/introduction) is a unified real-time data warehousing service developed by Alibaba Cloud. You can use Hologres to write, update, process, and analyze large amounts of data in real time.
>`Hologres` supports standard `SQL` syntax, is compatible with `PostgreSQL`, and supports most PostgreSQL functions. Hologres supports online analytical processing (OLAP) and ad hoc analysis for up to petabytes of data, and provides high-concurrency and low-latency online data services.
>`Hologres` provides **vector database** functionality by adopting [Proxima](https://www.alibabacloud.com/help/en/hologres/latest/vector-processing).
>`Proxima` is a high-performance software library developed by `Alibaba DAMO Academy`. It allows you to search for the nearest neighbors of vectors. Proxima provides higher stability and performance than similar open source software such as Faiss. Proxima allows you to search for similar text or image embeddings with high throughput and low latency. Hologres is deeply integrated with Proxima to provide a high-performance vector search service.
## Installation and Setup
Click [here](https://www.alibabacloud.com/zh/product/hologres) to quickly deploy a Hologres cloud instance.
```bash
pip install psycopg2
```
## Vector Store
See a [usage example](/docs/modules/data_connection/vectorstores/integrations/hologres.html).
```python
from langchain.vectorstores import Hologres
```
@@ -0,0 +1,69 @@
# Hugging Face
This page covers how to use the Hugging Face ecosystem (including the [Hugging Face Hub](https://huggingface.co)) within LangChain.
It is broken into two parts: installation and setup, and then references to specific Hugging Face wrappers.
## Installation and Setup
If you want to work with the Hugging Face Hub:
- Install the Hub client library with `pip install huggingface_hub`
- Create a Hugging Face account (it's free!)
- Create an [access token](https://huggingface.co/docs/hub/security-tokens) and set it as an environment variable (`HUGGINGFACEHUB_API_TOKEN`)
If you want to work with the Hugging Face Python libraries:
- Install `pip install transformers` for working with models and tokenizers
- Install `pip install datasets` for working with datasets
## Wrappers
### LLM
There exist two Hugging Face LLM wrappers, one for a local pipeline and one for a model hosted on Hugging Face Hub.
Note that these wrappers only work for models that support the following tasks: [`text2text-generation`](https://huggingface.co/models?library=transformers&pipeline_tag=text2text-generation&sort=downloads), [`text-generation`](https://huggingface.co/models?library=transformers&pipeline_tag=text-generation&sort=downloads)
To use the local pipeline wrapper:
```python
from langchain.llms import HuggingFacePipeline
```
To use the wrapper for a model hosted on Hugging Face Hub:
```python
from langchain.llms import HuggingFaceHub
```
For a more detailed walkthrough of the Hugging Face Hub wrapper, see [this notebook](/docs/modules/model_io/models/llms/integrations/huggingface_hub.html)
### Embeddings
There exist two Hugging Face Embeddings wrappers, one for a local model and one for a model hosted on Hugging Face Hub.
Note that these wrappers only work for [`sentence-transformers` models](https://huggingface.co/models?library=sentence-transformers&sort=downloads).
To use the local pipeline wrapper:
```python
from langchain.embeddings import HuggingFaceEmbeddings
```
To use the wrapper for a model hosted on Hugging Face Hub:
```python
from langchain.embeddings import HuggingFaceHubEmbeddings
```
For a more detailed walkthrough of this, see [this notebook](/docs/modules/data_connection/text_embedding/integrations/huggingfacehub.html)
### Tokenizer
There are several places you can use tokenizers available through the `transformers` package.
By default, it is used to count tokens for all LLMs.
You can also use it to count tokens when splitting documents with
```python
from langchain.text_splitter import CharacterTextSplitter
CharacterTextSplitter.from_huggingface_tokenizer(...)
```
For a more detailed walkthrough of this, see [this notebook](/docs/modules/data_connection/document_transformers/text_splitters/huggingface_length_function.html)
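To illustrate what "counting tokens when splitting" means, here is a simplified sketch of a splitter that merges pieces until a token budget would be exceeded. It is not LangChain's implementation: `split_by_token_budget` is a hypothetical function, and the whitespace-based `count_tokens` is a stand-in assumption for a real tokenizer's token count.

```python
def split_by_token_budget(text, count_tokens, max_tokens, separator="\n\n"):
    """Greedily merge separator-delimited pieces into chunks within max_tokens.

    A single piece that is larger than the budget is kept as its own chunk.
    """
    chunks, current = [], []
    for piece in text.split(separator):
        candidate = separator.join(current + [piece])
        if current and count_tokens(candidate) > max_tokens:
            # Adding this piece would exceed the budget: close the chunk.
            chunks.append(separator.join(current))
            current = [piece]
        else:
            current.append(piece)
    if current:
        chunks.append(separator.join(current))
    return chunks


def count_tokens(s):
    # Stand-in tokenizer (assumption): one token per whitespace-separated word.
    return len(s.split())


chunks = split_by_token_budget("a b c\n\nd e\n\nf g h i", count_tokens, max_tokens=5)
```

Swapping the stand-in for a function that counts a Hugging Face tokenizer's tokens gives chunk sizes measured in the same units as the model's context window, which is the motivation for the tokenizer-based splitter above.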
### Datasets
The Hugging Face Hub has lots of great [datasets](https://huggingface.co/datasets) that can be used to evaluate your LLM chains.
For a detailed walkthrough of how to use them to do so, see [this notebook](/docs/use_cases/evaluation/huggingface_datasets.html)
@@ -0,0 +1,16 @@
# iFixit
>[iFixit](https://www.ifixit.com) is the largest, open repair community on the web. The site contains nearly 100k
> repair manuals, 200k Questions & Answers on 42k devices, and all the data is licensed under `CC-BY-NC-SA 3.0`.
## Installation and Setup
There isn't any special setup for it.
## Document Loader
See a [usage example](/docs/modules/data_connection/document_loaders/integrations/ifixit.html).
```python
from langchain.document_loaders import IFixitLoader
```
@@ -0,0 +1,16 @@
# IMSDb
>[IMSDb](https://imsdb.com/) is the `Internet Movie Script Database`.
>
## Installation and Setup
There isn't any special setup for it.
## Document Loader
See a [usage example](/docs/modules/data_connection/document_loaders/integrations/imsdb.html).
```python
from langchain.document_loaders import IMSDbLoader
```
@@ -0,0 +1,9 @@
---
sidebar_position: 1
---
# Grouped by provider
import DocCardList from "@theme/DocCardList";
<DocCardList />
@@ -0,0 +1,35 @@
# Infino
>[Infino](https://github.com/infinohq/infino) is an open-source observability platform that stores both metrics and application logs together.
Key features of Infino include:
- Metrics Tracking: Capture the time the LLM takes to handle a request, errors, the number of tokens, and a cost indication for the particular LLM.
- Data Tracking: Log and store prompt, request, and response data for each LangChain interaction.
- Graph Visualization: Generate basic graphs over time, depicting metrics such as request duration, error occurrences, token count, and cost.
## Installation and Setup
First, you'll need to install the `infinopy` Python package as follows:
```bash
pip install infinopy
```
If you already have an Infino Server running, then you're good to go; but if
you don't, follow the next steps to start it:
- Make sure you have Docker installed
- Run the following in your terminal:
```
docker run --rm --detach --name infino-example -p 3000:3000 infinohq/infino:latest
```
## Using Infino
See a [usage example of `InfinoCallbackHandler`](/docs/modules/callbacks/integrations/infino.html).
```python
from langchain.callbacks import InfinoCallbackHandler
```
@@ -0,0 +1,74 @@
# Jina
This page covers how to use the Jina ecosystem within LangChain.
It is broken into two parts: installation and setup, and then references to specific Jina wrappers.
## Installation and Setup
- Install the Python SDK with `pip install jina`
- Get a Jina AI Cloud auth token from [here](https://cloud.jina.ai/settings/tokens) and set it as an environment variable (`JINA_AUTH_TOKEN`)
## Wrappers
### Embeddings
There exists a Jina Embeddings wrapper, which you can access with
```python
from langchain.embeddings import JinaEmbeddings
```
For a more detailed walkthrough of this, see [this notebook](/docs/modules/data_connection/text_embedding/integrations/jina.html)
## Deployment
[Langchain-serve](https://github.com/jina-ai/langchain-serve), powered by Jina, helps take LangChain apps to production with easy to use REST/WebSocket APIs and Slack bots.
### Usage
Install the package from PyPI.
```bash
pip install langchain-serve
```
Wrap your LangChain app with the `@serving` decorator.
```python
# app.py
from lcserve import serving

@serving
def ask(input: str) -> str:
    from langchain import LLMChain, OpenAI
    from langchain.agents import AgentExecutor, ZeroShotAgent

    tools = [...] # list of tools
    prompt = ZeroShotAgent.create_prompt(
        tools, input_variables=["input", "agent_scratchpad"],
    )
    llm_chain = LLMChain(llm=OpenAI(temperature=0), prompt=prompt)
    agent = ZeroShotAgent(
        llm_chain=llm_chain, allowed_tools=[tool.name for tool in tools]
    )
    agent_executor = AgentExecutor.from_agent_and_tools(
        agent=agent,
        tools=tools,
        verbose=True,
    )
    return agent_executor.run(input)
```
Deploy on Jina AI Cloud with `lc-serve deploy jcloud app`. Once deployed, we can send a POST request to the API endpoint to get a response.
```bash
curl -X 'POST' 'https://<your-app>.wolf.jina.ai/ask' \
 -d '{
  "input": "Your Question here?",
  "envs": {
    "OPENAI_API_KEY": "sk-***"
  }
}'
```
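The same request can be built from Python with only the standard library. This is a sketch under assumptions: `build_ask_request` is a hypothetical helper, the base URL is a placeholder for your deployed app, and the `Content-Type` header is an assumption that does not appear in the `curl` command above. The request is constructed but not sent.

```python
import json
import urllib.request


def build_ask_request(base_url: str, question: str, openai_api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for the `/ask` endpoint."""
    payload = {"input": question, "envs": {"OPENAI_API_KEY": openai_api_key}}
    return urllib.request.Request(
        url=f"{base_url}/ask",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumption: the API expects JSON
        method="POST",
    )


# Placeholder host; replace it with the URL of your deployed app.
request = build_ask_request("https://example.invalid", "Your question here?", "sk-***")
# To actually send it: urllib.request.urlopen(request).read()
```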
You can also self-host the app on your infrastructure with Docker-compose or Kubernetes. See [here](https://github.com/jina-ai/langchain-serve#-self-host-llm-apps-with-docker-compose-or-kubernetes) for more details.
Langchain-serve also lets you deploy apps with WebSocket APIs and Slack bots, either on [Jina AI Cloud](https://cloud.jina.ai/) or on self-hosted infrastructure.
@@ -0,0 +1,23 @@
# LanceDB
This page covers how to use [LanceDB](https://github.com/lancedb/lancedb) within LangChain.
It is broken into two parts: installation and setup, and then references to specific LanceDB wrappers.
## Installation and Setup
- Install the Python SDK with `pip install lancedb`
## Wrappers
### VectorStore
There exists a wrapper around LanceDB databases, allowing you to use it as a vectorstore,
whether for semantic search or example selection.
To import this vectorstore:
```python
from langchain.vectorstores import LanceDB
```
For a more detailed walkthrough of the LanceDB wrapper, see [this notebook](/docs/modules/data_connection/vectorstores/integrations/lancedb.html)
@@ -0,0 +1,368 @@
# LangChain Decorators ✨
LangChain Decorators is a layer on top of LangChain that provides syntactic sugar 🍭 for writing custom LangChain prompts and chains.
For feedback, issues, or contributions, please raise an issue here:
[ju-bezdek/langchain-decorators](https://github.com/ju-bezdek/langchain-decorators)
Main principles and benefits:
- more `pythonic` way of writing code
- write multiline prompts that won't break your code flow with indentation
- making use of IDE in-built support for **hinting**, **type checking** and **popup with docs** to quickly peek in the function to see the prompt, parameters it consumes etc.
- leverage all the power of 🦜🔗 LangChain ecosystem
- adding support for **optional parameters**
- easily share parameters between the prompts by binding them to one class
Here is a simple example of code written with **LangChain Decorators ✨**
``` python
from langchain_decorators import llm_prompt

@llm_prompt
def write_me_short_post(topic:str, platform:str="twitter", audience:str = "developers")->str:
    """
    Write me a short header for my post about {topic} for {platform} platform.
    It should be for {audience} audience.
    (Max 15 words)
    """
    return

# run it naturally
write_me_short_post(topic="starwars")
# or
write_me_short_post(topic="starwars", platform="reddit")
```
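To make the idea concrete, the sketch below shows how a decorator can turn a docstring into a prompt template filled from the call's arguments. It is not the library's implementation: `prompt_template` is a hypothetical name, and it only renders the text instead of calling an LLM.

```python
import inspect
import textwrap


def prompt_template(func):
    """Hypothetical decorator: render the docstring with the call's arguments."""
    signature = inspect.signature(func)
    template = textwrap.dedent(func.__doc__ or "").strip()

    def render(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        bound.apply_defaults()  # fill in defaults such as platform="twitter"
        return template.format(**bound.arguments)

    return render


@prompt_template
def write_me_short_post(topic: str, platform: str = "twitter", audience: str = "developers") -> str:
    """
    Write me a short header for my post about {topic} for {platform} platform.
    It should be for {audience} audience.
    (Max 15 words)
    """


rendered = write_me_short_post(topic="starwars")
print(rendered)
```

Binding the arguments through the function signature is what lets default values and keyword arguments flow into the template without extra bookkeeping.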
# Quick start
## Installation
```bash
pip install langchain_decorators
```
## Examples
A good way to start is to review the examples here:
- [jupyter notebook](https://github.com/ju-bezdek/langchain-decorators/blob/main/example_notebook.ipynb)
- [colab notebook](https://colab.research.google.com/drive/1no-8WfeP6JaLD9yUtkPgym6x0G9ZYZOG#scrollTo=N4cf__D0E2Yk)
# Defining other parameters
Here we are just marking a function as a prompt with the `llm_prompt` decorator, effectively turning it into an LLMChain.
A standard LLMChain takes many more init parameters than just `input_variables` and `prompt`; the decorator hides this implementation detail.
Here is how it works:
1. Using **Global settings**:
``` python
# define global settings for all prompts (if not set, ChatGPT is the current default)
from langchain.chat_models import ChatOpenAI
from langchain_decorators import GlobalSettings

GlobalSettings.define_settings(
    default_llm=ChatOpenAI(temperature=0.0),  # this is the default; you can change it here globally
    default_streaming_llm=ChatOpenAI(temperature=0.0, streaming=True),  # this is the default; it will be used for streaming
)
```
2. Using predefined **prompt types**
``` python
# You can change the default prompt types
from langchain.chat_models import ChatOpenAI
from langchain_decorators import PromptTypes, PromptTypeSettings, llm_prompt

PromptTypes.AGENT_REASONING.llm = ChatOpenAI()

# Or you can just define your own ones:
class MyCustomPromptTypes(PromptTypes):
    GPT4=PromptTypeSettings(llm=ChatOpenAI(model="gpt-4"))

@llm_prompt(prompt_type=MyCustomPromptTypes.GPT4)
def write_a_complicated_code(app_idea:str)->str:
    ...
```
3. Define the settings **directly in the decorator**
``` python
from langchain.llms import OpenAI
from langchain_decorators import llm_prompt

@llm_prompt(
    llm=OpenAI(temperature=0.7),
    stop_tokens=["\nObservation"],
    # ...
    )
def creative_writer(book_title:str)->str:
    ...
```
## Passing a memory and/or callbacks:
To pass any of these, just declare them in the function (or use kwargs to pass anything)
```python
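from langchain.memory import SimpleMemory
from langchain_decorators import llm_prompt

@llm_prompt()
async def write_me_short_post(topic:str, platform:str="twitter", audience:str="developers", memory:SimpleMemory = None):
    """
    {history_key}
    Write me a short header for my post about {topic} for {platform} platform.
    It should be for {audience} audience.
    (Max 15 words)
    """
    pass

# top-level await works in a notebook; elsewhere, call this from an async function
await write_me_short_post(topic="old movies")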
```
# Simplified streaming
If we want to leverage streaming:
- we need to define prompt as async function
- turn on the streaming on the decorator, or we can define PromptType with streaming on
- capture the stream using StreamingContext
This way we just mark which prompts should be streamed, without having to decide which LLM to use or to create a streaming handler and pass it down to a particular part of our chain. We just turn streaming on or off on the prompt or prompt type.
Streaming happens only if the prompt is called inside a streaming context, where we can define a simple function to handle the stream.
``` python
# this code example is complete and should run as it is in a notebook (it uses top-level await)
from langchain_decorators import StreamingContext, llm_prompt

# this marks the prompt for streaming (useful if we want to stream just some prompts in our app, but don't want to pass the callback handlers around)
# note that only async functions can be streamed (you will get an error if it's not)
@llm_prompt(capture_stream=True)
async def write_me_short_post(topic:str, platform:str="twitter", audience:str = "developers"):
    """
    Write me a short header for my post about {topic} for {platform} platform.
    It should be for {audience} audience.
    (Max 15 words)
    """
    pass

# just an arbitrary function to demonstrate the streaming... will be some websockets code in the real world
tokens=[]
def capture_stream_func(new_token:str):
    tokens.append(new_token)

# if we want to capture the stream, we need to wrap the execution into StreamingContext...
# this will allow us to capture the stream even if the prompt call is hidden inside a higher level method
# only the prompts marked with capture_stream will be captured here
with StreamingContext(stream_to_stdout=True, callback=capture_stream_func):
    result = await write_me_short_post(topic="old movies")
    print("Stream finished ... we can distinguish tokens thanks to alternating colors")

print("\nWe've captured",len(tokens),"tokens🎉\n")
print("Here is the result:")
print(result)
```
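One way to picture how a surrounding context can switch streaming on for calls buried deeper in the code is a context variable that holds the callback. The sketch below is an assumption about the general technique, not the library's `StreamingContext`: `streaming_context` and `fake_llm` are hypothetical, and the "model" simply emits the prompt word by word.

```python
import asyncio
import contextvars
from contextlib import contextmanager

# Holds the callback of the active context (None means "not streaming").
_stream_callback = contextvars.ContextVar("stream_callback", default=None)


@contextmanager
def streaming_context(callback):
    """Hypothetical stand-in for a streaming context manager."""
    token = _stream_callback.set(callback)
    try:
        yield
    finally:
        _stream_callback.reset(token)


async def fake_llm(prompt: str) -> str:
    """Pretend model: emits the prompt word by word."""
    callback = _stream_callback.get()
    words = prompt.split()
    for word in words:
        await asyncio.sleep(0)  # yield control, as a real network stream would
        if callback is not None:
            callback(word)
    return " ".join(words)


async def main():
    captured = []
    with streaming_context(captured.append):
        streamed = await fake_llm("a short header")
    # Outside the context nothing is captured.
    silent = await fake_llm("another header")
    return captured, streamed, silent


tokens, streamed, silent = asyncio.run(main())
```

Because the callback lives in a context variable, the code that calls the model does not need to receive or forward a handler explicitly.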
# Prompt declarations
By default, the prompt is the whole function docstring, unless you mark your prompt.
## Documenting your prompt
We can specify which part of the docstring is the prompt definition by putting it in a code block with the `<prompt>` language tag.
``` python
@llm_prompt
def write_me_short_post(topic:str, platform:str="twitter", audience:str = "developers"):
    """
    Here is a good way to write a prompt as part of a function docstring, with additional documentation for devs.

    It needs to be a code block, marked as a `<prompt>` language
    ```<prompt>
    Write me a short header for my post about {topic} for {platform} platform.
    It should be for {audience} audience.
    (Max 15 words)
    ```

    Now only the code block above will be used as a prompt, and the rest of the docstring will be used as a description for developers.
    (It also has the nice benefit that an IDE (like VS Code) will display the prompt properly (not trying to parse it as markdown, and thus not showing new lines properly))
    """
    return
```
## Chat messages prompt
For chat models, it is very useful to define the prompt as a set of message templates. Here is how to do it:
``` python
@llm_prompt
def simulate_conversation(human_input:str, agent_role:str="a pirate"):
    """
    ## System message
     - note the `:system` suffix inside the <prompt:_role_> tag

    ```<prompt:system>
    You are a {agent_role} hacker. You must act like one.
    You reply always in code, using python or javascript code block...
    for example:

    ... do not reply with anything else.. just with code - respecting your role.
    ```

    # human message
    (we are using the real roles that are enforced by the LLM - GPT supports system, assistant, user)
    ``` <prompt:user>
    Hello, who are you
    ```
    a reply:

    ``` <prompt:assistant>
    \``` python <<- escaping inner code block with \ that should be part of the prompt
    def hello():
        print("Argh... hello you pesky pirate")
    \```
    ```

    we can also add some history using placeholder
    ```<prompt:placeholder>
    {history}
    ```
    ```<prompt:user>
    {human_input}
    ```

    Now only the code blocks above will be used as a prompt, and the rest of the docstring will be used as a description for developers.
    (It also has the nice benefit that an IDE (like VS Code) will display the prompt properly (not trying to parse it as markdown, and thus not showing new lines properly))
    """
    pass
```
The roles here are the model's native roles (assistant, user, and system for ChatGPT).
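As a rough illustration of how tagged blocks could be pulled out of a docstring, here is a regex-based sketch. It is an assumption about the general approach, not the library's parser: `extract_prompt_blocks` is a hypothetical name, escaped inner fences are not handled, and the fence string is built programmatically only so that this example nests cleanly inside Markdown.

```python
import re
import textwrap

FENCE = "`" * 3  # three backticks, built programmatically

# Opening fence, optional spaces, <prompt> or <prompt:role>, body, closing fence on its own line.
PROMPT_BLOCK = re.compile(
    FENCE + r"[ \t]*<prompt(?::(\w+))?>[ \t]*\n(.*?)\n[ \t]*" + FENCE + r"[ \t]*$",
    re.DOTALL | re.MULTILINE,
)


def extract_prompt_blocks(docstring: str):
    """Return (role, text) pairs; role is None for a plain <prompt> block."""
    text = textwrap.dedent(docstring)
    return [(role or None, body.strip()) for role, body in PROMPT_BLOCK.findall(text)]


doc = "\n".join([
    "Notes for developers stay outside the fences.",
    FENCE + "<prompt:system>",
    "You are {agent_role}.",
    FENCE,
    FENCE + " <prompt:user>",
    "{human_input}",
    FENCE,
])
blocks = extract_prompt_blocks(doc)
```

Everything outside the fenced blocks is ignored, which is what allows the rest of the docstring to serve as developer documentation.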
# Optional sections
- you can define whole sections of your prompt that should be optional
- if any input in the section is missing, the whole section won't be rendered
The syntax for this is as follows:
``` python
@llm_prompt
def prompt_with_optional_partials():
    """
    this text will be rendered always, but

    {? anything inside this block will be rendered only if all the {value}s parameters are not empty (None | "") ?}

    you can also place it in between the words
    this too will be rendered{? , but
        this block will be rendered only if {this_value} and {this_value}
        is not empty?} !
    """
```
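The rendering rule for optional sections can be sketched with two regular expressions. This is a minimal illustration of the rule described above, not the library's code: `render` is a hypothetical helper, and nested optional sections are not supported.

```python
import re

# An optional section looks like {? ... ?}; it may contain {name} placeholders.
OPTIONAL = re.compile(r"\{\?(.*?)\?\}", re.DOTALL)
PLACEHOLDER = re.compile(r"\{(\w+)\}")


def render(template: str, **values) -> str:
    """Render a template, dropping optional sections whose inputs are missing."""

    def is_filled(name: str) -> bool:
        return values.get(name) not in (None, "")

    def render_optional(match):
        section = match.group(1)
        names = PLACEHOLDER.findall(section)
        # Keep the section only if every placeholder inside it has a value.
        return section if all(is_filled(n) for n in names) else ""

    text = OPTIONAL.sub(render_optional, template)
    return PLACEHOLDER.sub(lambda m: str(values[m.group(1)]), text)


template = "Write a post about {topic}{? for {audience}?}."
without_audience = render(template, topic="cats")
with_audience = render(template, topic="cats", audience="vets")
```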
# Output parsers
- llm_prompt decorator natively tries to detect the best output parser based on the output type. (if not set, it returns the raw string)
- list, dict and pydantic outputs are also supported natively (automatically)
``` python
# this code example is complete and should run as it is

from langchain_decorators import llm_prompt

@llm_prompt
def write_name_suggestions(company_business:str, count:int)->list:
    """ Write me {count} good name suggestions for company that {company_business}
    """
    pass

write_name_suggestions(company_business="sells cookies", count=5)
```
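Choosing a parser from the return annotation can be illustrated with the standard library alone. This sketch is an assumption about the general idea, not the decorator's actual logic: `pick_parser` and `parse_list` are hypothetical helpers, and the raw reply is a made-up string in place of a model response.

```python
import re
from typing import get_type_hints


def parse_list(text: str) -> list:
    """Turn a bulleted or numbered block of text into a list of strings."""
    items = []
    for line in text.splitlines():
        # Strip leading markers such as "1.", "2)", "-" or "*".
        cleaned = re.sub(r"^\s*(?:[-*]|\d+[.)])\s*", "", line).strip()
        if cleaned:
            items.append(cleaned)
    return items


def pick_parser(func):
    """Choose a parser from the return annotation (raw string by default)."""
    return_type = get_type_hints(func).get("return", str)
    return parse_list if return_type is list else str


def write_name_suggestions(company_business: str, count: int) -> list:
    ...


def write_slogan(company_business: str) -> str:
    ...


raw_reply = "1. Cookie Corner\n2. Crumbs & Co\n- Sweet Batch"
names = pick_parser(write_name_suggestions)(raw_reply)
slogan = pick_parser(write_slogan)(raw_reply)
```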
## More complex structures
For dict / pydantic outputs, you need to specify the formatting instructions.
This can be tedious, so you can let the output parser generate the instructions for you based on the (pydantic) model.
``` python
from langchain_decorators import llm_prompt
from pydantic import BaseModel, Field


class TheOutputStructureWeExpect(BaseModel):
    name:str = Field (description="The name of the company")
    headline:str = Field( description="The description of the company (for landing page)")
    employees:list[str] = Field(description="5-8 fake employee names with their positions")

@llm_prompt()
def fake_company_generator(company_business:str)->TheOutputStructureWeExpect:
    """ Generate a fake company that {company_business}
    {FORMAT_INSTRUCTIONS}
    """
    return

company = fake_company_generator(company_business="sells cookies")

# print the result nicely formatted
print("Company name: ",company.name)
print("company headline: ",company.headline)
print("company employees: ",company.employees)
```
# Binding the prompt to an object
``` python
from pydantic import BaseModel
from langchain_decorators import llm_prompt

class AssistantPersonality(BaseModel):
    assistant_name:str
    assistant_role:str
    field:str

    @property
    def a_property(self):
        return "whatever"

    def hello_world(self, function_kwarg:str=None):
        """
        We can reference any {field} or {a_property} inside our prompt... and combine it with {function_kwarg} in the method
        """

    @llm_prompt
    def introduce_your_self(self)->str:
        """
        ``` <prompt:system>
        You are an assistant named {assistant_name}.
        Your role is to act as {assistant_role}
        ```
        ```<prompt:user>
        Introduce your self (in less than 20 words)
        ```
        """

# `field` is a required attribute of the model; the value used here is arbitrary
personality = AssistantPersonality(assistant_name="John", assistant_role="a pirate", field="sailing")

print(personality.introduce_your_self(personality))
```
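The lookup order implied above (call arguments first, then fields and properties of the bound object) can be sketched with `str.format_map`. This is only an illustration under assumptions, not how the library resolves names: `AttributeLookup` and `Persona` are hypothetical.

```python
class AttributeLookup(dict):
    """Resolve {name} from explicit keyword arguments first, then from the bound object."""

    def __init__(self, obj, **kwargs):
        super().__init__(kwargs)
        self._obj = obj

    def __missing__(self, key):
        # Called by format_map when the key is not among the keyword arguments.
        return getattr(self._obj, key)


class Persona:
    def __init__(self, name: str, role: str):
        self.name = name
        self.role = role

    @property
    def title(self) -> str:
        return f"{self.name} the {self.role}"

    def introduce(self, greeting: str = "Hello") -> str:
        template = "{greeting}, I am {title}. Call me {name}."
        return template.format_map(AttributeLookup(self, greeting=greeting))


introduction = Persona("John", "pirate").introduce()
```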
# More examples:
- these and a few more examples are also available in the [colab notebook here](https://colab.research.google.com/drive/1no-8WfeP6JaLD9yUtkPgym6x0G9ZYZOG#scrollTo=N4cf__D0E2Yk)
- including the [ReAct Agent re-implementation](https://colab.research.google.com/drive/1no-8WfeP6JaLD9yUtkPgym6x0G9ZYZOG#scrollTo=3bID5fryE2Yp) using purely langchain decorators
@@ -0,0 +1,26 @@
# Llama.cpp
This page covers how to use [llama.cpp](https://github.com/ggerganov/llama.cpp) within LangChain.
It is broken into two parts: installation and setup, and then references to specific Llama-cpp wrappers.
## Installation and Setup
- Install the Python package with `pip install llama-cpp-python`
- Download one of the [supported models](https://github.com/ggerganov/llama.cpp#description) and convert it to the llama.cpp format per the [instructions](https://github.com/ggerganov/llama.cpp)
## Wrappers
### LLM
There exists a LlamaCpp LLM wrapper, which you can access with
```python
from langchain.llms import LlamaCpp
```
For a more detailed walkthrough of this, see [this notebook](/docs/modules/model_io/models/llms/integrations/llamacpp.html)
### Embeddings
There exists a LlamaCpp Embeddings wrapper, which you can access with
```python
from langchain.embeddings import LlamaCppEmbeddings
```
For a more detailed walkthrough of this, see [this notebook](/docs/modules/data_connection/text_embedding/integrations/llamacpp.html)
@@ -0,0 +1,31 @@
# Marqo
This page covers how to use the Marqo ecosystem within LangChain.
### **What is Marqo?**
Marqo is a tensor search engine that uses embeddings stored in in-memory HNSW indexes to achieve cutting edge search speeds. Marqo can scale to hundred-million document indexes with horizontal index sharding and allows for async and non-blocking data upload and search. Marqo uses the latest machine learning models from PyTorch, Huggingface, OpenAI and more. You can start with a pre-configured model or bring your own. The built in ONNX support and conversion allows for faster inference and higher throughput on both CPU and GPU.
Because Marqo includes its own inference, your documents can have a mix of text and images, and you can bring Marqo indexes with data from your other systems into the LangChain ecosystem without having to worry about your embeddings being compatible.
Deployment of Marqo is flexible: you can get started yourself with our docker image or [contact us about our managed cloud offering!](https://www.marqo.ai/pricing)
To run Marqo locally with our docker image, [see our getting started.](https://docs.marqo.ai/latest/)
## Installation and Setup
- Install the Python SDK with `pip install marqo`
## Wrappers
### VectorStore
There exists a wrapper around Marqo indexes, allowing you to use them within the vectorstore framework. Marqo lets you select from a range of models for generating embeddings and exposes some preprocessing configurations.
The Marqo vectorstore can also work with existing multimodal indexes where your documents have a mix of images and text. For more information, refer to [our documentation](https://docs.marqo.ai/latest/#multi-modal-and-cross-modal-search). Note that instantiating the Marqo vectorstore with an existing multimodal index will disable the ability to add any new documents to it via the LangChain vectorstore `add_texts` method.
To import this vectorstore:
```python
from langchain.vectorstores import Marqo
```
For a more detailed walkthrough of the Marqo wrapper and some of its unique features, see [this notebook](/docs/modules/data_connection/vectorstores/integrations/marqo.html)
@@ -0,0 +1,31 @@
# MediaWikiDump
>[MediaWiki XML Dumps](https://www.mediawiki.org/wiki/Manual:Importing_XML_dumps) contain the content of a wiki
> (wiki pages with all their revisions), without the site-related data. An XML dump does not create a full backup
> of the wiki database; the dump does not contain user accounts, images, edit logs, etc.
## Installation and Setup
We need to install several python packages.
The `mediawiki-utilities` supports XML schema 0.11 in unmerged branches.
```bash
pip install -qU git+https://github.com/mediawiki-utilities/python-mwtypes@updates_schema_0.11
```
The `mediawiki-utilities mwxml` has a bug; a fix PR is pending.
```bash
pip install -qU git+https://github.com/gdedrouas/python-mwxml@xml_format_0.11
pip install -qU mwparserfromhell
```
## Document Loader
See a [usage example](/docs/modules/data_connection/document_loaders/integrations/mediawikidump.html).
```python
from langchain.document_loaders import MWDumpLoader
```
@@ -0,0 +1,26 @@
# Metal
This page covers how to use [Metal](https://getmetal.io) within LangChain.
## What is Metal?
Metal is a managed retrieval & memory platform built for production. Easily index your data into `Metal` and run semantic search and retrieval on it.
![Metal](/img/MetalDash.png)
## Quick start
Get started by [creating a Metal account](https://app.getmetal.io/signup).
Then, you can easily take advantage of the `MetalRetriever` class to start retrieving your data for semantic search, prompting context, etc. This class takes a `Metal` instance and a dictionary of parameters to pass to the Metal API.
```python
from langchain.retrievers import MetalRetriever
from metal_sdk.metal import Metal
metal = Metal("API_KEY", "CLIENT_ID", "INDEX_ID")
retriever = MetalRetriever(metal, params={"limit": 2})
docs = retriever.get_relevant_documents("search term")
```
@@ -0,0 +1,22 @@
# Microsoft OneDrive
>[Microsoft OneDrive](https://en.wikipedia.org/wiki/OneDrive) (formerly `SkyDrive`) is a file-hosting service operated by Microsoft.
## Installation and Setup
First, you need to install a python package.
```bash
pip install o365
```
Then follow instructions [here](/docs/modules/data_connection/document_loaders/integrations/microsoft_onedrive.html).
## Document Loader
See a [usage example](/docs/modules/data_connection/document_loaders/integrations/microsoft_onedrive.html).
```python
from langchain.document_loaders import OneDriveLoader
```
@@ -0,0 +1,16 @@
# Microsoft PowerPoint
>[Microsoft PowerPoint](https://en.wikipedia.org/wiki/Microsoft_PowerPoint) is a presentation program by Microsoft.
## Installation and Setup
There isn't any special setup for it.
## Document Loader
See a [usage example](/docs/modules/data_connection/document_loaders/integrations/microsoft_powerpoint.html).
```python
from langchain.document_loaders import UnstructuredPowerPointLoader
```
@@ -0,0 +1,16 @@
# Microsoft Word
>[Microsoft Word](https://www.microsoft.com/en-us/microsoft-365/word) is a word processor developed by Microsoft.
## Installation and Setup
There isn't any special setup for it.
## Document Loader
See a [usage example](/docs/modules/data_connection/document_loaders/integrations/microsoft_word.html).
```python
from langchain.document_loaders import UnstructuredWordDocumentLoader
```
@@ -0,0 +1,20 @@
# Milvus
This page covers how to use the Milvus ecosystem within LangChain.
It is broken into two parts: installation and setup, and then references to specific Milvus wrappers.
## Installation and Setup
- Install the Python SDK with `pip install pymilvus`
## Wrappers
### VectorStore
There exists a wrapper around Milvus indexes, allowing you to use it as a vectorstore,
whether for semantic search or example selection.
To import this vectorstore:
```python
from langchain.vectorstores import Milvus
```
For a more detailed walkthrough of the Milvus wrapper, see [this notebook](/docs/modules/data_connection/vectorstores/integrations/milvus.html)
@@ -0,0 +1,116 @@
# MLflow AI Gateway
The MLflow AI Gateway service is a powerful tool designed to streamline the usage and management of various large language model (LLM) providers, such as OpenAI and Anthropic, within an organization. It offers a high-level interface that simplifies the interaction with these services by providing a unified endpoint to handle specific LLM related requests. See [the MLflow AI Gateway documentation](https://mlflow.org/docs/latest/gateway/index.html) for more details.
## Installation and Setup
Install `mlflow` with MLflow AI Gateway dependencies:
```sh
pip install 'mlflow[gateway]'
```
Set the OpenAI API key as an environment variable:
```sh
export OPENAI_API_KEY=...
```
Create a configuration file:
```yaml
routes:
  - name: completions
    route_type: llm/v1/completions
    model:
      provider: openai
      name: text-davinci-003
      config:
        openai_api_key: $OPENAI_API_KEY

  - name: embeddings
    route_type: llm/v1/embeddings
    model:
      provider: openai
      name: text-embedding-ada-002
      config:
        openai_api_key: $OPENAI_API_KEY
```
Start the Gateway server:
```sh
mlflow gateway start --config-path /path/to/config.yaml
```
## Completions Example
```python
import mlflow
from langchain import LLMChain, PromptTemplate
from langchain.llms import MlflowAIGateway

gateway = MlflowAIGateway(
    gateway_uri="http://127.0.0.1:5000",
    route="completions",
    params={
        "temperature": 0.0,
        "top_p": 0.1,
    },
)

llm_chain = LLMChain(
    llm=gateway,
    prompt=PromptTemplate(
        input_variables=["adjective"],
        template="Tell me a {adjective} joke",
    ),
)
result = llm_chain.run(adjective="funny")
print(result)

# log the chain defined above, then load it back as a pyfunc model
with mlflow.start_run():
    model_info = mlflow.langchain.log_model(llm_chain, "model")

model = mlflow.pyfunc.load_model(model_info.model_uri)
print(model.predict([{"adjective": "funny"}]))
```
## Embeddings Example
```python
from langchain.embeddings import MlflowAIGatewayEmbeddings

embeddings = MlflowAIGatewayEmbeddings(
    gateway_uri="http://127.0.0.1:5000",
    route="embeddings",
)

print(embeddings.embed_query("hello"))
print(embeddings.embed_documents(["hello"]))
```
## Databricks MLflow AI Gateway
Databricks MLflow AI Gateway is in private preview.
Please contact a Databricks representative to enroll in the preview.
```python
from langchain import LLMChain, PromptTemplate
from langchain.llms import MlflowAIGateway

gateway = MlflowAIGateway(
    gateway_uri="databricks",
    route="completions",
)

llm_chain = LLMChain(
    llm=gateway,
    prompt=PromptTemplate(
        input_variables=["adjective"],
        template="Tell me a {adjective} joke",
    ),
)
result = llm_chain.run(adjective="funny")
print(result)
```
@@ -0,0 +1,185 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# MLflow\n",
"\n",
"This notebook goes over how to track your LangChain experiments into your MLflow Server"
],
"id": "5d184f91"
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!pip install azureml-mlflow\n",
"!pip install pandas\n",
"!pip install textstat\n",
"!pip install spacy\n",
"!pip install openai\n",
"!pip install google-search-results\n",
"!python -m spacy download en_core_web_sm"
],
"id": "ca7bd72f"
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"os.environ[\"MLFLOW_TRACKING_URI\"] = \"\"\n",
"os.environ[\"OPENAI_API_KEY\"] = \"\"\n",
"os.environ[\"SERPAPI_API_KEY\"] = \"\""
],
"id": "bf8e1f5c"
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from langchain.callbacks import MlflowCallbackHandler\n",
"from langchain.llms import OpenAI"
],
"id": "fd49fd45"
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"\"\"\"Main function.\n",
"\n",
"This function is used to try the callback handler.\n",
"Scenarios:\n",
"1. OpenAI LLM\n",
"2. Chain with multiple SubChains on multiple generations\n",
"3. Agent with Tools\n",
"\"\"\"\n",
"mlflow_callback = MlflowCallbackHandler()\n",
"llm = OpenAI(\n",
" model_name=\"gpt-3.5-turbo\", temperature=0, callbacks=[mlflow_callback], verbose=True\n",
")"
],
"id": "578cac8c"
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# SCENARIO 1 - LLM\n",
"llm_result = llm.generate([\"Tell me a joke\"])\n",
"\n",
"mlflow_callback.flush_tracker(llm)"
],
"id": "9b20acae"
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from langchain.prompts import PromptTemplate\n",
"from langchain.chains import LLMChain"
],
"id": "8b872046"
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# SCENARIO 2 - Chain\n",
"template = \"\"\"You are a playwright. Given the title of play, it is your job to write a synopsis for that title.\n",
"Title: {title}\n",
"Playwright: This is a synopsis for the above play:\"\"\"\n",
"prompt_template = PromptTemplate(input_variables=[\"title\"], template=template)\n",
"synopsis_chain = LLMChain(llm=llm, prompt=prompt_template, callbacks=[mlflow_callback])\n",
"\n",
"test_prompts = [\n",
" {\n",
" \"title\": \"documentary about good video games that push the boundary of game design\"\n",
" },\n",
"]\n",
"synopsis_chain.apply(test_prompts)\n",
"mlflow_callback.flush_tracker(synopsis_chain)"
],
"id": "1b2627ef"
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "_jN73xcPVEpI"
},
"outputs": [],
"source": [
"from langchain.agents import initialize_agent, load_tools\n",
"from langchain.agents import AgentType"
],
"id": "e002823a"
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "Gpq4rk6VT9cu"
},
"outputs": [],
"source": [
"# SCENARIO 3 - Agent with Tools\n",
"tools = load_tools([\"serpapi\", \"llm-math\"], llm=llm, callbacks=[mlflow_callback])\n",
"agent = initialize_agent(\n",
" tools,\n",
" llm,\n",
" agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,\n",
" callbacks=[mlflow_callback],\n",
" verbose=True,\n",
")\n",
"agent.run(\n",
" \"Who is Leo DiCaprio's girlfriend? What is her current age raised to the 0.43 power?\"\n",
")\n",
"mlflow_callback.flush_tracker(agent, finish=True)"
],
"id": "655bd47e"
}
],
"metadata": {
"colab": {
"provenance": []
},
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.16"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
@@ -0,0 +1,95 @@
# Modal
This page covers how to use the Modal ecosystem to run LangChain custom LLMs.
It is broken into two parts:
1. Modal installation and web endpoint deployment
2. Using deployed web endpoint with `LLM` wrapper class.
## Installation and Setup
- Install with `pip install modal`
- Run `modal token new`
## Define your Modal Functions and Webhooks
You must include a prompt. There is a rigid response structure:
```python
class Item(BaseModel):
    prompt: str

@stub.function()
@modal.web_endpoint(method="POST")
def get_text(item: Item):
    return {"prompt": run_gpt2.call(item.prompt)}
```
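The request and response contract shown above (a JSON object with a `prompt` key going in, and a JSON object with a `prompt` key coming back) can be sketched from the client side. These helpers are hypothetical and use only the standard library; they illustrate the data shape and do not call Modal.

```python
import json


def build_request_body(prompt: str) -> bytes:
    """Serialize the request the endpoint expects: a JSON object with a `prompt` key."""
    return json.dumps({"prompt": prompt}).encode("utf-8")


def read_response_text(raw: bytes) -> str:
    """Extract the generated text, which the endpoint returns under the `prompt` key."""
    payload = json.loads(raw)
    if not isinstance(payload, dict) or "prompt" not in payload:
        raise ValueError("response must be a JSON object with a 'prompt' key")
    return payload["prompt"]


body = build_request_body("Once upon a time")
# A made-up response body, standing in for what the endpoint would return.
text = read_response_text(b'{"prompt": "Once upon a time there was a model."}')
```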
The following is an example with the GPT2 model:
```python
from pydantic import BaseModel

import modal

CACHE_PATH = "/root/model_cache"

class Item(BaseModel):
    prompt: str

stub = modal.Stub(name="example-get-started-with-langchain")

def download_model():
    from transformers import GPT2Tokenizer, GPT2LMHeadModel
    tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
    model = GPT2LMHeadModel.from_pretrained('gpt2')
    tokenizer.save_pretrained(CACHE_PATH)
    model.save_pretrained(CACHE_PATH)

# Define a container image for the LLM function below, which
# downloads and stores the GPT-2 model.
image = modal.Image.debian_slim().pip_install(
    "tokenizers", "transformers", "torch", "accelerate"
).run_function(download_model)

@stub.function(
    gpu="any",
    image=image,
    retries=3,
)
def run_gpt2(text: str):
    from transformers import GPT2Tokenizer, GPT2LMHeadModel
    tokenizer = GPT2Tokenizer.from_pretrained(CACHE_PATH)
    model = GPT2LMHeadModel.from_pretrained(CACHE_PATH)
    encoded_input = tokenizer(text, return_tensors='pt').input_ids
    output = model.generate(encoded_input, max_length=50, do_sample=True)
    return tokenizer.decode(output[0], skip_special_tokens=True)

@stub.function()
@modal.web_endpoint(method="POST")
def get_text(item: Item):
    return {"prompt": run_gpt2.call(item.prompt)}
```
### Deploy the web endpoint
Deploy the web endpoint to Modal cloud with the [`modal deploy`](https://modal.com/docs/reference/cli/deploy) CLI command.
Your web endpoint will acquire a persistent URL under the `modal.run` domain.
## LLM wrapper around Modal web endpoint
The `Modal` LLM wrapper class accepts your deployed web endpoint's URL.
```python
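from langchain import LLMChain, PromptTemplate
from langchain.llms import Modal

endpoint_url = "https://ecorp--custom-llm-endpoint.modal.run"  # REPLACE ME with your deployed Modal web endpoint's URL

# Example prompt template, added so the snippet runs on its own; adjust the wording as needed.
prompt = PromptTemplate(template="Question: {question}\n\nAnswer:", input_variables=["question"])

llm = Modal(endpoint_url=endpoint_url)
llm_chain = LLMChain(prompt=prompt, llm=llm)

question = "What NFL team won the Super Bowl in the year Justin Bieber was born?"
llm_chain.run(question)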
```
@@ -0,0 +1,20 @@
# ModelScope
This page covers how to use the ModelScope ecosystem within LangChain.
It is broken into two parts: installation and setup, and then references to specific ModelScope wrappers.
## Installation and Setup
* Install the Python SDK with `pip install modelscope`
## Wrappers
### Embeddings
There exists a ModelScope Embeddings wrapper, which you can access with
```python
from langchain.embeddings import ModelScopeEmbeddings
```
For a more detailed walkthrough of this, see [this notebook](/docs/modules/data_connection/text_embedding/integrations/modelscope_hub.html)
@@ -0,0 +1,19 @@
# Modern Treasury
>[Modern Treasury](https://www.moderntreasury.com/) simplifies complex payment operations. It is a unified platform to power products and processes that move money.
>- Connect to banks and payment systems
>- Track transactions and balances in real-time
>- Automate payment operations for scale
## Installation and Setup
There isn't any special setup for it.
## Document Loader
See a [usage example](/docs/modules/data_connection/document_loaders/integrations/modern_treasury.html).
```python
from langchain.document_loaders import ModernTreasuryLoader
```
@@ -0,0 +1,54 @@
# Momento
>[Momento Cache](https://docs.momentohq.com/) is the world's first truly serverless caching service. It provides instant elasticity, scale-to-zero
> capability, and blazing-fast performance.
> With Momento Cache, you grab the SDK, you get an endpoint, input a few lines into your code, and you're off and running.
This page covers how to use the [Momento](https://gomomento.com) ecosystem within LangChain.
## Installation and Setup
- Sign up for a free account [here](https://docs.momentohq.com/getting-started) and get an auth token
- Install the Momento Python SDK with `pip install momento`
## Cache
The Cache wrapper allows for [Momento](https://gomomento.com) to be used as a serverless, distributed, low-latency cache for LLM prompts and responses.
The standard cache is the go-to use case for [Momento](https://gomomento.com) users in any environment.
Import the cache as follows:
```python
from langchain.cache import MomentoCache
```
And set up like so:
```python
from datetime import timedelta
from momento import CacheClient, Configurations, CredentialProvider
import langchain

# Instantiate the Momento client
cache_client = CacheClient(
    Configurations.Laptop.v1(),
    CredentialProvider.from_environment_variable("MOMENTO_AUTH_TOKEN"),
    default_ttl=timedelta(days=1))

# Choose a Momento cache name of your choice
cache_name = "langchain"

# Instantiate the LLM cache
langchain.llm_cache = MomentoCache(cache_client, cache_name)
```
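To make the effect of `default_ttl` concrete, here is a small in-memory sketch of a prompt cache with expiry. It is not how Momento or LangChain implements caching: the class `TTLPromptCache` and its constructor arguments are hypothetical and only illustrate the idea that entries are keyed by the prompt plus the LLM configuration and expire after a fixed time-to-live.

```python
import time
from datetime import timedelta
from typing import Callable, Dict, Optional, Tuple


class TTLPromptCache:
    """Illustrative in-memory cache; entries expire after `ttl`."""

    def __init__(self, ttl: timedelta, clock: Callable[[], float] = time.monotonic):
        self._ttl_seconds = ttl.total_seconds()
        self._clock = clock  # injectable so expiry can be exercised without sleeping
        self._entries: Dict[Tuple[str, str], Tuple[float, str]] = {}

    def update(self, prompt: str, llm_string: str, response: str) -> None:
        """Store a response together with its expiry time."""
        expires_at = self._clock() + self._ttl_seconds
        self._entries[(prompt, llm_string)] = (expires_at, response)

    def lookup(self, prompt: str, llm_string: str) -> Optional[str]:
        """Return the cached response, or None if it is absent or expired."""
        entry = self._entries.get((prompt, llm_string))
        if entry is None:
            return None
        expires_at, response = entry
        if self._clock() >= expires_at:
            # Expired entries are dropped lazily on lookup.
            del self._entries[(prompt, llm_string)]
            return None
        return response
```

With a one-day TTL, as in the setup above, a repeated prompt against the same model configuration would be served from the cache until the entry expires; the same prompt against a different configuration would miss.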
## Memory
Momento can be used as a distributed memory store for LLMs.
### Chat Message History Memory
See [this notebook](/docs/modules/memory/integrations/momento_chat_message_history.html) for a walkthrough of how to use Momento as a memory store for chat message history.
@@ -0,0 +1,50 @@
# Motherduck
>[Motherduck](https://motherduck.com/) is a managed DuckDB-in-the-cloud service.
## Installation and Setup
First, you need to install the `duckdb` Python package.
```bash
pip install duckdb
```
You will also need to sign up for an account at [Motherduck](https://motherduck.com/).
After that, you should set up a connection string. We mostly integrate with Motherduck through SQLAlchemy.
The connection string is likely in the form:
```python
token="..."
conn_str = f"duckdb:///md:{token}@my_db"
```
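Hard-coding the token is rarely desirable. Here is a minimal sketch, assuming the token is stored in an environment variable and the connection string takes the form shown above. The variable name `MOTHERDUCK_TOKEN` and the helper `build_conn_str` are illustrative choices, not something this page prescribes.

```python
import os
from typing import Optional


def build_conn_str(token: str, database: str = "my_db") -> str:
    """Assemble the connection string in the form shown above."""
    if not token:
        raise ValueError("a Motherduck token is required")
    return f"duckdb:///md:{token}@{database}"


# MOTHERDUCK_TOKEN is an assumed variable name; use whatever your deployment defines.
token = os.environ.get("MOTHERDUCK_TOKEN", "")
conn_str: Optional[str] = build_conn_str(token) if token else None
```

Keeping the token out of source files also keeps it out of version control and notebooks that get shared.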
## SQLChain
You can use the SQLChain to query data in your Motherduck instance in natural language.
```python
from langchain import OpenAI, SQLDatabase, SQLDatabaseChain
db = SQLDatabase.from_uri(conn_str)
db_chain = SQLDatabaseChain.from_llm(OpenAI(temperature=0), db, verbose=True)
```
From here, see the [SQL Chain](/docs/modules/chains/popular/sqlite.html) documentation for how to use it.
## LLMCache
You can also easily use Motherduck to cache LLM requests.
Once again this is done through the SQLAlchemy wrapper.
```python
import sqlalchemy
import langchain
from langchain.cache import SQLAlchemyCache

eng = sqlalchemy.create_engine(conn_str)
langchain.llm_cache = SQLAlchemyCache(engine=eng)
```
From here, see the [LLM Caching](/docs/modules/model_io/models/llms/how_to/llm_caching) documentation for how to use it.
@@ -0,0 +1,65 @@
# MyScale
This page covers how to use the MyScale vector database within LangChain.
It is broken into two parts: installation and setup, and then references to specific MyScale wrappers.
With MyScale, you can manage both structured and unstructured (vectorized) data, and perform joint queries and analytics on both types of data using SQL. Plus, MyScale's cloud-native OLAP architecture, built on top of ClickHouse, enables lightning-fast data processing even on massive datasets.
## Introduction
[Overview of MyScale and high-performance vector search](https://docs.myscale.com/en/overview/)
You can now register on our SaaS and [start a cluster now!](https://docs.myscale.com/en/quickstart/)
If you are also interested in how we managed to integrate SQL and vector search, please refer to [this document](https://docs.myscale.com/en/vector-reference/) for further syntax reference.
We also offer live demos on Hugging Face. Check out our [Hugging Face space](https://huggingface.co/myscale), where the demos search millions of vectors in the blink of an eye.
## Installation and Setup
- Install the Python SDK with `pip install clickhouse-connect`
### Setting up environments
There are two ways to set up parameters for the MyScale index.
1. Environment Variables
Before you run the app, please set the environment variable with `export`:
`export MYSCALE_HOST='<your-endpoints-url>' MYSCALE_PORT=<your-endpoints-port> MYSCALE_USERNAME=<your-username> MYSCALE_PASSWORD=<your-password> ...`
You can easily find your account, password and other info on our SaaS. For details, please refer to [this document](https://docs.myscale.com/en/cluster-management/).
Every attribute under `MyScaleSettings` can be set with the prefix `MYSCALE_`, and the names are case insensitive.
2. Create `MyScaleSettings` object with parameters
```python
from langchain.vectorstores import MyScale, MyScaleSettings
config = MyScaleSettings(host="<your-backend-url>", port=8443, ...)
index = MyScale(embedding_function, config)
index.add_documents(...)
```
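The environment-variable route can also be inspected from Python. The sketch below assumes only the variable names listed above. The helper `collect_myscale_env` is hypothetical: it merely mimics the stated rule that `MYSCALE_`-prefixed names are matched case-insensitively, and it does not call into LangChain or MyScale.

```python
import os
from typing import Dict, Mapping


def collect_myscale_env(environ: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Gather MYSCALE_-prefixed variables, keyed by the lower-cased setting name."""
    prefix = "myscale_"
    settings: Dict[str, str] = {}
    for name, value in environ.items():
        # Compare in lower case so MYSCALE_HOST and myscale_host both match.
        if name.lower().startswith(prefix):
            settings[name.lower()[len(prefix):]] = value
    return settings
```

Printing the result before constructing the index is a quick way to confirm which settings the process can actually see.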
## Wrappers
Supported functions:
- `add_texts`
- `add_documents`
- `from_texts`
- `from_documents`
- `similarity_search`
- `asimilarity_search`
- `similarity_search_by_vector`
- `asimilarity_search_by_vector`
- `similarity_search_with_relevance_scores`
### VectorStore
There exists a wrapper around the MyScale database, allowing you to use it as a vectorstore,
whether for semantic search or similar example retrieval.
To import this vectorstore:
```python
from langchain.vectorstores import MyScale
```
For a more detailed walkthrough of the MyScale wrapper, see [this notebook](/docs/modules/data_connection/vectorstores/integrations/myscale.html)
@@ -0,0 +1,17 @@
# NLPCloud
This page covers how to use the NLPCloud ecosystem within LangChain.
It is broken into two parts: installation and setup, and then references to specific NLPCloud wrappers.
## Installation and Setup
- Install the Python SDK with `pip install nlpcloud`
- Get an NLPCloud API key and set it as an environment variable (`NLPCLOUD_API_KEY`)
## Wrappers
### LLM
There exists an NLPCloud LLM wrapper, which you can access with
```python
from langchain.llms import NLPCloud
```
