langchain

mirror of https://github.com/kennethreitz/langchain.git synced 2026-06-05 23:00:18 +00:00

Author	SHA1	Message	Date
Harrison Chase	f95cedc443	Harrison/sql rows (#915 ) Co-authored-by: Jon Luo <20971593+jzluo@users.noreply.github.com>	2023-02-06 18:56:18 -08:00
Harrison Chase	ba5a2f06b9	Harrison/inference endpoint (#861 ) Co-authored-by: Eno Reyes <enoreyes@gmail.com>	2023-02-06 18:14:25 -08:00
Harrison Chase	2ec25ddd4c	add unstructured examples (#913 )	2023-02-06 18:13:46 -08:00
Kevin Huo	31b054f69d	Add pinecone integration test (#911 ) Basic integration test for pinecone	2023-02-06 18:13:35 -08:00
Harrison Chase	93a091cfb8	Optionally return shell output on incorrect command (#894 ) (#899 ) This allows the LLM to correct its previous command by looking at the error message output to the shell. Additionally, this uses subprocess.run because that is now recommended over subprocess.check_output: https://docs.python.org/3/library/subprocess.html#using-the-subprocess-module Co-authored-by: Amos Ng <me@amos.ng>	2023-02-06 12:46:16 -08:00
James Briggs	3aa53b44dd	added i_end in batch extraction (#907 ) Fix for issue #906 Switches `[i : i + batch_size]` to `[i : i_end]` in Pinecone `from_texts` method	2023-02-06 12:45:56 -08:00
Harrison Chase	82c080c6e6	bump version to 0078 (#908 )	2023-02-06 00:32:44 -08:00
Harrison Chase	71e662e88d	update docs (#905 )	2023-02-06 00:26:20 -08:00
Harrison Chase	53d56d7650	Harrison/unstructured support (#903 )	2023-02-05 23:02:07 -08:00
Harrison Chase	2a68be3e8d	chat vector db chain (#902 )	2023-02-05 21:38:47 -08:00
James Briggs	8217a2f26c	Update pinecone init details in docs (#898 ) PR to fix outdated environment details in the docs, see issue #897 I added code comments as pointers to where users go to get API keys, and where they can find the relevant environment variable.	2023-02-05 15:21:56 -08:00
Bagatur	7658263bfb	Check type of LLM.generate `prompts` arg (#886 ) Was passing prompt in directly as string and getting nonsense outputs. Had to inspect source code to realize that first arg should be a list. Could be nice if there was an explicit error or warning, seems like this could be a common mistake.	2023-02-04 22:49:17 -08:00
Samantha Whitmore	32b11101d3	Get elements of ActionInput on newlines (#889 ) The re.DOTALL flag in Python's re (regular expression) module makes the . (dot) metacharacter match newline characters as well as any other character. Without re.DOTALL, the . metacharacter only matches any character except for a newline character. With re.DOTALL, the . metacharacter matches any character, including newline characters.	2023-02-04 20:42:25 -08:00
Harrison Chase	1614c5f5fd	fix flaky tests (#892 )	2023-02-04 20:41:33 -08:00
Harrison Chase	a2b699dcd2	prompt template from string (#884 )	2023-02-04 17:04:58 -08:00
Alex	7cc44b3bdb	Add to gallery (#882 )	2023-02-04 09:45:20 -08:00
Harrison Chase	0b9f086d36	Harrison/docs splitter (#879 )	2023-02-03 15:09:13 -08:00
Harrison Chase	bcfbc7a818	version 0077 (#878 )	2023-02-03 14:49:52 -08:00
Ryan Walker	1dd0733515	Fix small typo in getting started docs (#876 ) Just noticed this little typo while reading the docs, thought I'd open a PR!	2023-02-03 14:22:12 -08:00
Zach Schillaci	4c79100b15	Correct prompt typo + update example for SQLDatabaseChain (#868 ) See https://github.com/hwchase17/langchain/issues/821	2023-02-03 08:34:41 -08:00
Harrison Chase	777aaff841	fix routing to tiktoken encoder (#866 )	2023-02-02 22:08:14 -08:00
Harrison Chase	e9ef08862d	validate template (#865 )	2023-02-02 22:08:01 -08:00
Harrison Chase	364b771743	sql return direct (#864 )	2023-02-02 22:07:41 -08:00
Harrison Chase	483441d305	pass kwargs through to loading (#863 )	2023-02-02 22:07:26 -08:00
Harrison Chase	8df6b68093	fix length based example selector (#862 )	2023-02-02 22:06:56 -08:00
Harrison Chase	3f48eed5bd	Harrison/milvus (#856 ) Signed-off-by: Filip Haltmayer <filip.haltmayer@zilliz.com> Signed-off-by: Frank Liu <frank.liu@zilliz.com> Co-authored-by: Filip Haltmayer <81822489+filip-halt@users.noreply.github.com> Co-authored-by: Frank Liu <frank@frankzliu.com>	2023-02-02 22:05:47 -08:00
Ankush Gola	933441cc52	Add retry to OpenAI llm (#849 ) add ability to retry when certain exceptions are raised by `openai.Completions.create` Test plan: ran all OpenAI integration tests.	2023-02-02 19:56:26 -08:00
kahkeng	4a8f5cdf4b	Add alternative token-based text splitter (#816 ) This does not involve a separator, and will naively chunk input text at the appropriate boundaries in token space. This is helpful if we have strict token length limits that we need to strictly follow the specified chunk size, and we can't use aggressive separators like spaces to guarantee the absence of long strings. CharacterTextSplitter will let these strings through without splitting them, which could cause overflow errors downstream. Splitting at arbitrary token boundaries is not ideal but is hopefully mitigated by having a decent overlap quantity. Also this results in chunks which have the exact number of tokens desired, instead of sometimes overcounting if we concatenate shorter strings. Potentially also helps with #528.	2023-02-02 19:55:13 -08:00
Harrison Chase	523ad2e6bd	vercel deployments (#850 )	2023-02-02 19:54:09 -08:00
Harrison Chase	fc0cfd7d1f	docs (#848 )	2023-02-02 11:35:36 -08:00
Harrison Chase	4d32441b86	bump version to 0076 (#847 )	2023-02-02 10:05:39 -08:00
Harrison Chase	23d5f64bda	Harrison/ngram example (#846 ) Co-authored-by: Sean Spriggens <ssprigge@syr.edu>	2023-02-02 09:44:42 -08:00
Harrison Chase	0de55048b7	return code for pal (#844 )	2023-02-02 08:47:20 -08:00
Harrison Chase	d564308e0f	rfc: instruct embeddings (#811 ) Co-authored-by: seanaedmiston <seane999@gmail.com>	2023-02-02 08:44:02 -08:00
Nick Furlotte	576609e665	Update PAL to allow passing local and global context to PythonREPL (#774 ) Passing additional variables to the python environment can be useful for example if you want to generate code to analyze a dataset. I also added a tracker for the executed code - `code_history`.	2023-02-02 08:34:23 -08:00
Harrison Chase	3f952eb597	add from string method (#820 )	2023-02-02 08:23:54 -08:00
Ikko Eltociear Ashimine	ba26a879e0	Fix typo in crawler.py (#842 ) seperator -> separator	2023-02-02 08:23:38 -08:00
Eli Mernit	bfabd1d5c0	Added new deployment template (#835 ) This PR introduces a new template for deploying LangChain apps as web endpoints. It includes template code, and links to a detailed code-walkthrough.	2023-02-01 23:38:36 -08:00
Jonas Ehrenstein	f3508228df	Minor fix for google search util: it's uncertain if "snippet" in results exists (#830 ) The results from Google search may not always contain a "snippet". Example: `{'kind': 'customsearch#result', 'title': 'FEMA Flood Map', 'htmlTitle': 'FEMA Flood Map', 'link': 'https://msc.fema.gov/portal/home', 'displayLink': 'msc.fema.gov', 'formattedUrl': 'https://msc.fema.gov/portal/home', 'htmlFormattedUrl': 'https://<b>msc</b>.fema.gov/portal/home'}` This will cause a KeyError at line 99 `snippets.append(result["snippet"])`.	2023-02-01 23:37:52 -08:00
Zach Schillaci	b4eb043b81	Minor fix to SQLDatabaseChain doc (#826 )	2023-02-01 23:37:38 -08:00
Istora Mandiri	06438794e1	Fix typo in textsplitter docs (#825 )	2023-02-01 23:32:35 -08:00
Raza Habib	9f8e05ffd4	Update __init__.py (#827 ) Remove duplicate APIChain	2023-02-01 23:31:38 -08:00
Harrison Chase	b0d560be56	add to gallery (#824 )	2023-02-01 07:10:15 -08:00
Johanna Appel	ebea40ce86	Add 'truncate' parameter for CohereEmbeddings (#798 ) Currently, the 'truncate' parameter of the cohere API is not supported. This means that by default, if trying to generate an embedding that is too big, the call will just fail with an error (which is frustrating if using this embedding source e.g. with GPT-Index, because it's hard to handle it properly when generating a lot of embeddings). With the parameter, one can decide to either truncate the START or END of the text to fit the max token length and still generate an embedding without throwing the error. In this PR, I added this parameter to the class. _Arguably, there should be a better way to handle this error, e.g. by optionally calling a function or so that gets triggered when the token limit is reached and can split the document or some such. Especially in the use case with GPT-Index, it's often hard to estimate the token counts for each document and I'd rather sort out the troublemakers or simply split them than interrupting the whole execution. Thoughts?_ --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-02-01 07:09:03 -08:00
Harrison Chase	b9045f7e0d	bump version to 0075 (#819 )	2023-01-31 00:18:32 -08:00
Harrison Chase	7b4882a2f4	Harrison/tf embeddings (#817 ) Co-authored-by: Ryohei Kuroki <10434946+yakigac@users.noreply.github.com>	2023-01-31 00:00:08 -08:00
Harrison Chase	5d4b6e4d4e	conversational agent fix (#818 )	2023-01-30 23:59:55 -08:00
Harrison Chase	94ae126747	return sql intermediate steps (#792 )	2023-01-30 15:10:48 -08:00
bair82	ae5695ad32	Update cohere.py (#795 ) When stop tokens are set in Cohere LLM constructor, they are currently not stripped from the response, and they should be stripped	2023-01-30 14:55:44 -08:00
Johanna Appel	cacf4091c0	Fix documentation for 'model' parameter in CohereEmbeddings (#797 ) Currently, the class parameter 'model_name' of the CohereEmbeddings class is not supported, but 'model' is. The class documentation is inconsistent with this, though, so I propose to either fix the documentation (this PR right now) or fix the parameter. It will create the following error: ``` ValidationError: 1 validation error for CohereEmbeddings model_name extra fields not permitted (type=value_error.extra) ```	2023-01-30 14:55:08 -08:00
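
The message for commit 32b11101d3 above describes how `re.DOTALL` changes what `.` matches. A minimal sketch of that behaviour follows; the sample text and the pattern are hypothetical illustrations and are not taken from the commit itself.

```python
import re

# Hypothetical agent output whose "Action Input" spans two lines.
llm_output = "Action: Search\nAction Input: first line\nsecond line"

# Illustrative pattern only; the pattern used in the commit may differ.
pattern = r"Action: (.*?)\nAction Input: (.*)"

# Without re.DOTALL, "." stops at the first newline.
without_flag = re.search(pattern, llm_output)

# With re.DOTALL, "." also matches newlines, so the whole input is captured.
with_flag = re.search(pattern, llm_output, re.DOTALL)

print(repr(without_flag.group(2)))
print(repr(with_flag.group(2)))
```

In this sketch the first capture ends at the line break, while the second includes both lines of the input.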

509 Commits