Commit Graph

1727 Commits

Author SHA1 Message Date
Steve Kim 9b830f437c Deleted importing Document from document_loaders.base because Documen… (#4068)
Hi,

- Modification:
https://python.langchain.com/en/latest/modules/indexes/document_loaders/examples/arxiv.html
- Reason: In this example, the first line is unnecessary because the
Document class does not exist in the base.
- Resolves: Issue #4052

--------
P.S: This pull-request is my first time, so please let me know if I need
to correct or write more explanation.
2023-05-03 17:54:30 -07:00
hp0404 374725a715 Refactor TelegramChatLoader and FacebookChatLoader classes and add tests (#3863)
This PR includes two main changes:

- Refactor the `TelegramChatLoader` and `FacebookChatLoader` classes by
removing the dependency on pandas and simplifying the message filtering
process.

- Add test cases for the `TelegramChatLoader` and `FacebookChatLoader`
classes. This test ensures that the class correctly loads and processes
the example chat data, providing better test coverage for this
functionality.
2023-05-03 15:59:19 -07:00
Jon Saginaw ea64b1716d Enhancement: option to Get All Tokens with a single Blockchain Document Loader call (#3797)
The Blockchain Document Loader's default behavior is to return 100
tokens at a time which is the Alchemy API limit. The Document Loader
exposes a startToken that can be used for pagination against the API.

This enhancement includes an optional get_all_tokens param (default:
False) which will:

- Iterate over the Alchemy API until it receives all the tokens, and
return the tokens in a single call to the loader.
- Manage all/most tokenId formats (this can be int, hex16 with zero or
all the leading zeros). There aren't constraints as to how smart
contracts can represent this value, but these three are most common.

Note that a contract with 10,000 tokens will issue 100 calls to the
Alchemy API, and could take about a minute, which is why this param will
default to False. But I've been using the doc loader with these
utilities on the side, so figured it might make sense to build them in
for others to use.
2023-05-03 15:46:44 -07:00
Akash Sharma 525db1b6cb Fixed typo leading to broken link (#4034) 2023-05-03 14:45:54 -07:00
Zander Chase afa9d1292b Re-Permit Partials in Tool (#4058)
Resolved issue #4053

Now that StructuredTool is a separate class, this constraint is no
longer needed.

Added/updated a unit test
2023-05-03 13:16:41 -07:00
Zander Chase 7e967aa4d5 Update Notebooks (#4051) 2023-05-03 09:31:02 -07:00
Nuno Campos f3ec6d2449 Replace remaining usage of basellm with baselangmodel (#3981) 2023-05-02 21:52:29 -07:00
mbchang f291fd7eed docs: remove stdout from pip install (for gymnasium) (#3993) 2023-05-02 21:51:40 -07:00
Harrison Chase b67be55ab8 bump ver (#4018) 2023-05-02 19:02:02 -07:00
Harrison Chase a5dd73c1a6 Revert "[agent][property type] Change allowed_tools to Set as Duplicate doesn’t make sense" (#4014)
Reverts hwchase17/langchain#3840
2023-05-02 18:58:05 -07:00
Davis Chase df3bc707fc Dev2049/callback example fix (#4010)
Closes #3997

---------

Co-authored-by: Akshaj Jain <akshaj.jain@gmail.com>
2023-05-02 16:20:16 -07:00
Davis Chase f08a76250f Better custom model handling OpenAICallbackHandler (#4009)
Thanks @maykcaldas for flagging! think this should resolve #3988. Let me
know if you still see issues after next release.
2023-05-02 16:19:57 -07:00
Zander Chase aa38355999 Vwp/docs improved document loaders (#4006)
Huge thanks to @leo-gan for improving the document loaders notebooks

---------

Co-authored-by: Leonid Ganeline <leo.gan.57@gmail.com>
2023-05-02 15:24:53 -07:00
Zander Chase 1c68cbdb28 Fix typing of attribute (#3999) 2023-05-02 15:11:23 -07:00
MichaelMDowling 36ee60c96c Update \docs\modules\models\text_embedding\examples\openai.ipynb (#3976)
Single edit to: models/text_embedding/examples/openai.ipynb - Line 88:
changed from: "embeddings = OpenAIEmbeddings(model_name=\"ada\")" to
"embeddings = OpenAIEmbeddings()" as model_name is no longer part of the
OpenAIEmbeddings class.
2023-05-02 14:41:31 -07:00
Harrison Chase e23391965b fix import (#4003) 2023-05-02 14:26:46 -07:00
Jinto Jose 013208cce6 Fix Documentation - Nomic - Atlas Jupyter Notebook (#3987)
Correction to Numic-Atlas Jupyter Notebook Docs
2023-05-02 14:20:01 -07:00
Ankush Gola 18f9d7b4f6 don't deepcopy handlers (#3995)
Co-authored-by: Sami Liedes <sami.liedes@iki.fi>
Co-authored-by: Sami Liedes <sami.liedes@rocket-science.ch>
2023-05-02 13:53:27 -07:00
Mike Wang c26cf04110 [check] add import check and warning for pandas (#3944)
- as titled, add an `import` catch for pandas with a user suggestion
message.
2023-05-02 10:08:16 -07:00
Chop Tr 71a337dac6 Update output_fixing_parser.ipynb (#3978) 2023-05-02 09:33:46 -07:00
Ankush Gola 3bd5a99b83 v2 tracer with single runs endpoint (#3951) 2023-05-01 22:41:32 -07:00
Harrison Chase 8fcb56e74a bump version to 155 (#3943) 2023-05-01 22:05:52 -07:00
Harrison Chase ca08a34a98 retry to parsing (#3696) 2023-05-01 22:05:42 -07:00
mbchang 3993166b5e docs: remove stdout from pip install (#3945) 2023-05-01 22:05:22 -07:00
Harrison Chase 2366e71bed Harrison/azure openai (#3942)
Co-authored-by: Saverio Proto <zioproto@gmail.com>
2023-05-01 21:34:16 -07:00
Harrison Chase 48ea27ba60 Harrison/blockwise sitemap (#3940)
Co-authored-by: Martin Holzhauer <martin@holzhauer.eu>
2023-05-01 21:34:07 -07:00
Harrison Chase 483fe257d9 bump timeout (#3939) 2023-05-01 21:33:57 -07:00
Jan Philipp Harries fc3c2c4406 Async Support for LLMChainExtractor (new) (#3780)
@vowelparrot @hwchase17 Here a new implementation of
`acompress_documents` for `LLMChainExtractor ` without changes to the
sync-version, as you suggested in #3587 / [Async Support for
LLMChainExtractor](https://github.com/hwchase17/langchain/pull/3587) .

I created a new PR to avoid cluttering history with reverted commits,
hope that is the right way.
Happy for any improvements/suggestions.

(PS:
I also tried an alternative implementation with a nested helper function
like

``` python
  async def acompress_documents_old(
      self, documents: Sequence[Document], query: str
  ) -> Sequence[Document]:
      """Compress page content of raw documents."""
      async def _compress_concurrently(doc):
          _input = self.get_input(query, doc)
          output = await self.llm_chain.apredict_and_parse(**_input)
          return Document(page_content=output, metadata=doc.metadata)
      outputs=await asyncio.gather(*[_compress_concurrently(doc) for doc in documents])
      compressed_docs=list(filter(lambda x: len(x.page_content)>0,outputs))
      return compressed_docs
```

But in the end I found the commited version to be better readable and
more "canonical" - hope you agree.
2023-05-01 21:23:13 -07:00
Harrison Chase 2cecc572f9 Harrison/chroma get (#3938)
Co-authored-by: sdan <git@sdan.io>
2023-05-01 21:19:28 -07:00
liviuasnash1 6396a4ad8d Fix documentation typos (#3870)
Co-authored-by: Liviu Asnash <liviua@maximallearning.com>
2023-05-01 20:58:38 -07:00
Hristo Stoychev 109927cdb2 Make project compatible with SQLAlchemy 1.3.* (#3862)
Related to [this
issue.](https://github.com/hwchase17/langchain/issues/3655#issuecomment-1529415363)

The `Mapped` SQLAlchemy class is introduced in SQLAlchemy 1.4 but the
migration from 1.3 to 1.4 is quite challenging so, IMO, it's better to
keep backwards compatibility and not change the SQLAlchemy requirements
just because of type annotations.
2023-05-01 20:58:22 -07:00
sqr 8bbdde8f9e make ARG POETRY_HOME available in multistage (#3882) 2023-05-01 20:57:41 -07:00
玄猫 188a7bd653 fix: pgvector hang risk if table not exist #3883 (#3884) 2023-05-01 20:57:31 -07:00
tomer555 9acf80fd69 fix: invalid escape sequence error in regex pattern (#3902)
This PR fixes the "SyntaxError: invalid escape sequence" error in the
pydantic.py file. The issue was caused by the backslashes in the regular
expression pattern being treated as escape characters. By using a raw
string literal for the regex pattern (e.g., r"\{.*\}"), this fix ensures
that backslashes are treated as literal characters, thus preventing the
error.

Co-authored-by: Tomer Levy <tomer.levy@tipalti.com>
2023-05-01 20:57:19 -07:00
Samuel Dion-Girardeau c5c33786a7 Fix bad spellings for 'convenience' (#3936)
Found in the docs for chat prompt templates:

https://python.langchain.com/en/latest/getting_started/getting_started.html#chat-prompt-templates

and fixed similar issues in neighboring notebooks.
2023-05-01 20:57:06 -07:00
Harrison Chase f04faf8496 Harrison/spreedly (#3937)
Co-authored-by: Esmit Pérez <esmitperez@users.noreply.github.com>
2023-05-01 20:56:56 -07:00
Harrison Chase cd3f8582cb Harrison/combined memory (#3935)
Co-authored-by: engkheng <60956360+outday29@users.noreply.github.com>
2023-05-01 20:55:56 -07:00
Zander Chase c4cb55a0c5 [Breaking] Migrate GPT4All to use PyGPT4All (#3934)
Seems the pyllamacpp package is no longer the supported bindings from
gpt4all. Tested that this works locally.

Given that the older models weren't very performant, I think it's better
to migrate now without trying to include a lot of try / except blocks

---------

Co-authored-by: Nissan Pow <npow@users.noreply.github.com>
Co-authored-by: Nissan Pow <pownissa@amazon.com>
2023-05-01 20:42:45 -07:00
leo-gan f0a4bbb8e2 updated YouTube links (#3916)
Added several links to fresh videos

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-05-01 20:39:59 -07:00
Mike Wang 68a18cc621 [simple] add ddg-search to __init__ for easier loading (#3933)
the same as other tools
2023-05-01 20:39:17 -07:00
Matt Robinson c51dec5101 feat: add Unstructured API loaders (#3906)
### Summary

Adds `UnstructuredAPIFileLoaders` and `UnstructuredAPIFIleIOLoaders`
that partition documents through the Unstructured API. Defaults to the
URL for hosted Unstructured API, but can switch to a self hosted or
locally running API using the `url` kwarg. Currently, the Unstructured
API is open and does not require an API, but it will soon. A note was
added about that to the Unstructured ecosystem page.

### Testing


```python
from langchain.document_loaders import UnstructuredAPIFileIOLoader

filename = "fake-email.eml"

with open(filename, "rb") as f:
    loader = UnstructuredAPIFileIOLoader(file=f, file_filename=filename)
    docs = loader.load()

docs[0]
```

```python
from langchain.document_loaders import UnstructuredAPIFileLoader

filename = "fake-email.eml"
loader = UnstructuredAPIFileLoader(file_path=filename, mode="elements")
docs = loader.load()

docs[0]
```
2023-05-01 20:37:35 -07:00
Harrison Chase 13269fb583 Harrison/relevancy score (#3907)
Co-authored-by: Ryan Grippeling <R.Grippeling@hotmail.com>
Co-authored-by: Ryan <ryan@webgrip.nl>
Co-authored-by: Zander Chase <130414180+vowelparrot@users.noreply.github.com>
2023-05-01 20:37:24 -07:00
Zander Chase c582f2e9e3 Add Structure Chat Agent (#3912)
Create a new chat agent that is compatible with the Multi-input tools
2023-05-01 20:34:50 -07:00
Mike Wang ec21b7126c [agent][property type] Change allowed_tools to Set as Duplicate doesn’t make sense (#3840)
- ActionAgent has a property called, `allowed_tools`, which is declared
as `List`. It stores all provided tools which is available to use during
agent action.
- This collection shouldn’t allow duplicates. The original datatype List
doesn’t make sense. Each tool should be unique. Even when there are
variants (assuming in the future), it would be named differently in
load_tools.


Test:
- confirm the functionality in an example by initializing an agent with
a list of 2 tools and confirm everything works.
```python3
def test_agent_chain_chat_bot():
	from langchain.agents import load_tools
	from langchain.agents import initialize_agent
	from langchain.agents import AgentType
	from langchain.chat_models import ChatOpenAI
	from langchain.llms import OpenAI
	from langchain.utilities.duckduckgo_search import DuckDuckGoSearchAPIWrapper

	chat = ChatOpenAI(temperature=0)
	llm = OpenAI(temperature=0)
	tools = load_tools(["ddg-search", "llm-math"], llm=llm)

	agent = initialize_agent(tools, chat, agent=AgentType.CHAT_ZERO_SHOT_REACT_DESCRIPTION, verbose=True)
	agent.run("Who is Olivia Wilde's boyfriend? What is his current age raised to the 0.23 power?")
test_agent_chain_chat_bot()
```
Result:
<img width="863" alt="Screenshot 2023-05-01 at 7 58 11 PM"
src="https://user-images.githubusercontent.com/62768671/235572157-0937594c-ddfb-4760-acb2-aea4cacacd89.png">
2023-05-01 20:30:10 -07:00
Harrison Chase c5cc09d4e3 Harrison/agent exec kwargs (#3917)
Co-authored-by: Zach Schillaci <40636930+zachschillaci27@users.noreply.github.com>
2023-05-01 20:28:43 -07:00
Harrison Chase 05170b6764 Harrison/from documents (#3919)
Co-authored-by: Gabriel Altay <gabriel.altay@gmail.com>
2023-05-01 20:28:14 -07:00
Davis Chase e7e29f9937 Dev2049/add modern treasury (#3924)
Modified Modern Treasury and Strip slightly so credentials don't have to
be passed in explicitly. Thanks @mattgmarcus for adding Modern Treasury!

---------

Co-authored-by: Matt Marcus <matt.g.marcus@gmail.com>
2023-05-01 20:28:02 -07:00
Davis Chase 5db6b796cf Dev2049/hf emb encode kwargs (#3925)
Thanks @amogkam for the addition! Refactored slightly

---------

Co-authored-by: Amog Kamsetty <amogkam@users.noreply.github.com>
2023-05-01 20:27:41 -07:00
mbchang ffc87233a1 refactor GymnasiumAgent (#3927)
refactor GymnasiumAgent (for single-agent environments) to be extensible
to PettingZooAgent (multi-agent environments)
2023-05-01 20:25:03 -07:00
mbchang 81601d886c new example: multi-agent simulations with environment (#3928) 2023-05-01 20:24:15 -07:00