Commit Graph

48 Commits

Author SHA1 Message Date
Harrison Chase a5bf8c9b9d Harrison/aleph alpha embeddings (#2117)
Co-authored-by: Piotr Mazurek <piotr635@gmail.com>
Co-authored-by: PiotrMazurek <piotr.mazurek@aleph-alpha.com>
2023-03-28 15:18:03 -07:00
Harrison Chase 410bf37fb8 Harrison/big query (#2100)
Co-authored-by: lu-cashmoney <lucas.corley@gmail.com>
2023-03-28 08:17:22 -07:00
Stéphane Busso 0bee219cb3 feat: Add Notion database document loader (#2056)
This PR adds Notion DB loader for langchain. 

It reads content from pages within a Notion Database. It uses the Notion
API to query the database and read the pages. It also reads the metadata
from the pages and stores it in the Document object.
2023-03-28 08:07:09 -07:00
Harrison Chase d5825bd3e8 Harrison/whatsapp loader (#2085)
Co-authored-by: Moshe <hello@moshemalka.me>
2023-03-27 23:43:45 -07:00
Harrison Chase f74a1bebf5 Harrison/duckdb (#2064)
Co-authored-by: Trent Hauck <trent@trenthauck.com>
2023-03-27 19:51:34 -07:00
Harrison Chase 76ecca4d53 redis retriever (#2060) 2023-03-27 19:51:23 -07:00
Harrison Chase 30e3b31b04 Harrison/document cleanup (#2062)
Co-authored-by: Delip Rao <delip@users.noreply.github.com>
2023-03-27 16:32:55 -07:00
Harrison Chase a0cd6672aa Harrison/site map (#2061)
Co-authored-by: Tim Asp <707699+timothyasp@users.noreply.github.com>
2023-03-27 16:28:08 -07:00
Harrison Chase 705431aecc big docs refactor (#1978)
Co-authored-by: Ankush Gola <ankush.gola@gmail.com>
2023-03-26 19:49:46 -07:00
Harrison Chase 6ec5780547 add docs for openai retriever ingest (#1969) 2023-03-24 08:24:33 -07:00
Harrison Chase 47d37db2d2 WIP: Harrison/base retriever (#1765) 2023-03-24 07:46:49 -07:00
Harrison Chase 8990122d5d retrievers interface (#1948) 2023-03-23 19:00:38 -07:00
Harrison Chase 50626a10ee Hx23840 feat/add redisearch vectorstore (#1909)
Co-authored-by: Peter <peter.shi@alephf.com>
Co-authored-by: Peter Shi <42536066+hx23840@users.noreply.github.com>
2023-03-22 19:57:56 -07:00
Sean Zheng 15b5a08f4b Update how_to_guides.rst (#1893)
Adding OpenSearch examples
2023-03-22 14:30:43 -07:00
Harrison Chase f299bd1416 clean up sagemaker nb (#1875) 2023-03-21 22:00:08 -07:00
Philipp Schmid 064be93edf [Embeddings] Add SageMaker Endpoint Embedding class (#1859)
# What does this PR do? 

This PR adds similar to `llms` a SageMaker-powered `embeddings` class.
This is helpful if you want to leverage Hugging Face models on SageMaker
for creating your indexes.

I added a example into the
[docs/modules/indexes/examples/embeddings.ipynb](https://github.com/hwchase17/langchain/compare/master...philschmid:add-sm-embeddings?expand=1#diff-e82629e2894974ec87856aedd769d4bdfe400314b03734f32bee5990bc7e8062)
document. The example currently includes some `_### TEMPORARY: Showing
how to deploy a SageMaker Endpoint from a Hugging Face model ###_ ` code
showing how you can deploy a sentence-transformers to SageMaker and then
run the methods of the embeddings class.

@hwchase17 please let me know if/when i should remove the `_###
TEMPORARY: Showing how to deploy a SageMaker Endpoint from a Hugging
Face model ###_` in the description i linked to a detail blog on how to
deploy a Sentence Transformers so i think we don't need to include those
steps here.

I also reused the `ContentHandlerBase` from
`langchain.llms.sagemaker_endpoint` and changed the output type to `any`
since it is depending on the implementation.
2023-03-21 21:51:48 -07:00
anupam-tiwari 86822d1cc2 Fixes the import typo in the vector db text generator notebook (#1874)
Fixes the import typo in the vector db text generator notebook for the
chroma library

Co-authored-by: Anupam <anupam@10-16-252-145.dynapool.wireless.nyu.edu>
2023-03-21 21:48:26 -07:00
Tomoko Uchida b706966ebc Add setup instruction in Getting Started for Indexing (#1847)
`VectorstoreIndexCreator` [uses Chroma as the vectorstore by
default](https://github.com/hwchase17/langchain/blob/1c22657256a69ecf739134da7d9cec5e9365a75f/langchain/indexes/vectorstore.py#L49).
It may be helpful to add a short note for the setup.

You can see how the notebook looks here.

https://github.com/mocobeta/langchain/blob/feat/add-setup-instruction-to-index-getting-started/docs/modules/indexes/getting_started.ipynb
2023-03-21 09:06:35 -07:00
Harrison Chase 1c22657256 Harrison/faiss merge (#1843)
Co-authored-by: Ting Su <ting.su.1995@outlook.com>
2023-03-20 22:54:08 -07:00
Harrison Chase d5d50c39e6 Harrison/azure embeddings (#1787)
Co-authored-by: Hemant <4627288+ghaccount@users.noreply.github.com>
2023-03-19 10:42:33 -07:00
Harrison Chase 96ebe98dc2 Harrison/latex splitter (#1738)
Co-authored-by: Aidan Holland <thehappydinoa@gmail.com>
Co-authored-by: Jan de Boer <44832123+Janldeboer@users.noreply.github.com>
2023-03-17 08:10:27 -07:00
Piyush Jain 1279c8de39 Fixed typo, clarified language (#1682) 2023-03-15 08:00:11 -07:00
Jithin James 6f4f771897 docs: add path to state_of_the_union.txt in indexes/getting_started page (#1691)
add the state_of_the_union.txt file so that its easier to follow through
with the example.

---------

Co-authored-by: Jithin James <jjmachan@pop-os.localdomain>
2023-03-15 07:59:47 -07:00
Harrison Chase 0b29e68c17 Harrison/pgvector (#1679)
Co-authored-by: Aman Kumar <krsingh.aman@gmail.com>
2023-03-14 21:13:58 -07:00
Harrison Chase 4d7fdb8957 Harrison/gml save (#1676)
Co-authored-by: Satoru Sakamoto <51464932+satoru814@users.noreply.github.com>
2023-03-14 20:00:22 -07:00
Xin Qiu 4e13cef05a feat: add redisearch vectorstore (#1307)
# Description

Add `RediSearch` vectorstore for LangChain

RediSearch: [RediSearch quick
start](https://redis.io/docs/stack/search/quick_start/)

# How to use

```
from langchain.vectorstores.redisearch import RediSearch

rds = RediSearch.from_documents(docs, embeddings,redisearch_url="redis://localhost:6379")
```
2023-03-14 18:06:03 -07:00
Harrison Chase c9b5a30b37 move output parsing (#1605) 2023-03-11 16:41:03 -08:00
Andriy Mulyar c9189d354a AtlasDB vector store documentation updates. (#1572)
- Updated errors in the AtlasDB vector store documentation
- Removed extraneous output logs in example notebook.
2023-03-09 16:31:14 -08:00
Brenton Wheeler 99fe023496 docs: fix typo in modules/indexes/chain_examples/question_answering (#1551)
docs: fix typo in modules/indexes/chain_examples/question_answering


![image](https://user-images.githubusercontent.com/11394076/224007874-3a52adf6-ff7a-4f22-9dbf-18c83d08167f.png)
2023-03-09 09:11:43 -08:00
Harrison Chase 523ad8d2e2 Harrison/chat history formatter1 (#1538)
Co-authored-by: Youssef A. Abukwaik <yousseb@users.noreply.github.com>
2023-03-08 20:46:37 -08:00
Harrison Chase 3610ef2830 add fake embeddings class (#1503) 2023-03-07 15:23:46 -08:00
Harrison Chase 0e21463f07 (rfc) chat models (#1424)
Co-authored-by: Ankush Gola <ankush.gola@gmail.com>
2023-03-06 08:34:24 -08:00
Harrison Chase a1b9dfc099 Harrison/similarity search chroma (#1434)
Co-authored-by: shibuiwilliam <shibuiyusuke@gmail.com>
2023-03-04 08:10:15 -08:00
Tim Asp bca0935d90 [docs] fix minor import error (#1425) 2023-03-03 16:10:07 -08:00
Kacper Łukawski 9ac442624c Add Qdrant named arguments (#1386)
This PR:
- Increases `qdrant-client` version to 1.0.4
- Introduces custom content and metadata keys (as requested in #1087)
- Moves all the `QdrantClient` parameters into the method parameters to
simplify code completion
2023-03-02 07:05:14 -08:00
Lakshya Agarwal cfed0497ac Minor grammatical fixes (#1325)
Fixed typos and links in a few places across documents
2023-03-01 21:18:09 -08:00
Harrison Chase f61858163d bump version to 0.0.95 (#1324) 2023-02-27 07:45:54 -08:00
Harrison Chase 0824d65a5c Harrison/indexing pipeline (#1317) 2023-02-27 00:31:36 -08:00
Harrison Chase 166cda2cc6 Harrison/deeplake (#1316)
Co-authored-by: Davit Buniatyan <d@activeloop.ai>
2023-02-26 22:35:04 -08:00
Harrison Chase aaad6cc954 Harrison/atlas db (#1315)
Co-authored-by: Brandon Duderstadt <brandonduderstadt@gmail.com>
2023-02-26 22:11:38 -08:00
Marc Puig 3989c793fd Making it possible to use "certainty" as a parameter for the weaviate similarity_search (#1218)
Checking if weaviate similarity_search kwargs contains "certainty" and
use it accordingly. The minimal level of certainty must be a float, and
it is computed by normalized distance.
2023-02-26 17:55:28 -08:00
Sason cc7d2e5621 Correct typo in "Question Answering" How-To Guide (#1221) 2023-02-21 17:02:58 -08:00
Harrison Chase 047231840d add docs for chroma persistance (#1202) 2023-02-20 23:04:17 -08:00
Harrison Chase 5bdb8dd6fe Harrison/unstructured io (#1200) 2023-02-20 22:54:49 -08:00
Naveen Tatikonda 0118706fd6 Add Support for OpenSearch Vector database (#1191)
### Description
This PR adds a wrapper which adds support for the OpenSearch vector
database. Using opensearch-py client we are ingesting the embeddings of
given text into opensearch cluster using Bulk API. We can perform the
`similarity_search` on the index using the 3 popular searching methods
of OpenSearch k-NN plugin:

- `Approximate k-NN Search` use approximate nearest neighbor (ANN)
algorithms from the [nmslib](https://github.com/nmslib/nmslib),
[faiss](https://github.com/facebookresearch/faiss), and
[Lucene](https://lucene.apache.org/) libraries to power k-NN search.
- `Script Scoring` extends OpenSearch’s script scoring functionality to
execute a brute force, exact k-NN search.
- `Painless Scripting` adds the distance functions as painless
extensions that can be used in more complex combinations. Also, supports
brute force, exact k-NN search like Script Scoring.

### Issues Resolved 
https://github.com/hwchase17/langchain/issues/1054

---------

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
2023-02-20 18:39:34 -08:00
Harrison Chase 926c121b98 Harrison/text splitter docs (#1188) 2023-02-20 15:14:03 -08:00
Harrison Chase 91446a5e9b clean up text splitting docs (#1184) 2023-02-20 11:24:31 -08:00
Harrison Chase 4f3fbd7267 improve docs for indexes (#1146) 2023-02-19 23:14:50 -08:00