26 Commits

Author SHA1 Message Date
Bagatur f7f3c02585 bump 287 (#10498) 2023-09-12 08:06:47 -07:00
Bagatur d2d11ccf63 bump 285 (#10373) 2023-09-08 08:26:31 -07:00
Bagatur 672907bbbb bump 284 (#10330) 2023-09-07 08:45:42 -07:00
Tomaz Bratanic db73c9d5b5 Diffbot Graph Transformer / Neo4j Graph document ingestion (#9979)
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-09-06 13:32:59 -07:00
Bagatur 098b4aa465 bump 281 (#10189) 2023-09-04 08:51:50 -07:00
Bagatur 0e4c5dd176 bump 13 (#10130) 2023-09-02 10:22:31 -07:00
maks-operlejn-ds a8f804a618 Add data anonymizer (#9863)
### Description

The feature for anonymizing data has been implemented. In order to
protect private data, such as when querying external APIs (OpenAI), it
is worth pseudonymizing sensitive data to maintain full privacy.

Anonymization consists of two steps:

1. **Identification:** Identify all data fields that contain personally
identifiable information (PII).
2. **Replacement:** Replace all PII with pseudo values or codes that do
not reveal any personal information about the individual but can be used
for reference. We're not using regular encryption, because the language
model won't be able to understand the meaning or context of the
encrypted data.

We use *Microsoft Presidio* together with the *Faker* framework for
anonymization purposes because of the wide range of functionalities they
provide. The full implementation is available in `PresidioAnonymizer`.
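
To make the two steps concrete, here is a minimal standard-library sketch. It
is not the `PresidioAnonymizer` implementation: the regex patterns, the
`identify` and `replace` helpers, and the `<TYPE_n>` placeholder format are
all assumptions made for illustration, standing in for what Presidio
(detection) and Faker (pseudo values) would do.

```python
import re
from itertools import count

# Hypothetical patterns for illustration only; a real detector such as
# Presidio recognizes many more entity types and formats.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}


def identify(text):
    """Step 1: return sorted (start, end, entity_type) spans of detected PII."""
    spans = []
    for entity_type, pattern in PII_PATTERNS.items():
        for match in pattern.finditer(text):
            spans.append((match.start(), match.end(), entity_type))
    return sorted(spans)


def replace(text, spans):
    """Step 2: swap every span for a readable placeholder.

    Each occurrence gets its own placeholder, so the result stays
    meaningful to a language model, unlike encrypted text.
    Assumes the spans do not overlap.
    """
    counter = count(1)
    pieces = []
    cursor = 0
    for start, end, entity_type in spans:
        pieces.append(text[cursor:start])
        pieces.append(f"<{entity_type}_{next(counter)}>")
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)


text = "Contact john.doe@example.com or 555-123-4567."
print(replace(text, identify(text)))
```

Keeping the two steps as separate functions mirrors the description above:
the detector can be swapped out without touching the replacement logic.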

### Future work

- **deanonymization** - add the ability to reverse anonymization. For
example, the workflow could look like this: `anonymize -> LLMChain ->
deanonymize`. By doing this, we will retain anonymity in requests to,
for example, OpenAI, and then be able to restore the original data.
- **instance anonymization** - at this point, each occurrence of PII is
treated as a separate entity and separately anonymized. Therefore, two
occurrences of the name John Doe in the text will be changed to two
different names. It is therefore worth introducing support for full
instance detection, so that repeated occurrences are treated as a single
object.
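
The two ideas above can be sketched together with a single mapping from
original values to placeholders. This is only an illustration of the proposed
behavior, not code from this PR: the fixed name list stands in for a real
detector, and the `anonymize` and `deanonymize` helpers and the `<PERSON_n>`
format are hypothetical.

```python
import re

# Hypothetical stand-in for a detector: match a fixed list of known names.
PERSON = re.compile("|".join(re.escape(n) for n in ("John Doe", "Jane Roe")))


def anonymize(text):
    """Replace each distinct name with one stable placeholder.

    Returns the masked text and the original -> placeholder mapping.
    """
    mapping = {}

    def substitute(match):
        original = match.group(0)
        if original not in mapping:
            # First time this value is seen: allocate a new placeholder.
            mapping[original] = f"<PERSON_{len(mapping) + 1}>"
        return mapping[original]

    return PERSON.sub(substitute, text), mapping


def deanonymize(text, mapping):
    """Reverse the substitution using the mapping built by anonymize."""
    for original, placeholder in mapping.items():
        text = text.replace(placeholder, original)
    return text


text = "John Doe met Jane Roe. Later John Doe left."
masked, mapping = anonymize(text)
print(masked)
print(deanonymize(masked, mapping))
```

In an `anonymize -> LLMChain -> deanonymize` workflow, the mapping would be
kept locally while only the masked text is sent to the external API.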

### Twitter handle
@deepsense_ai / @MaksOpp

---------

Co-authored-by: MaksOpp <maks.operlejn@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-08-30 10:39:44 -07:00
Bagatur d6957921f0 bump 276 (#9931) 2023-08-29 08:00:38 -07:00
Bagatur 9731ce5a40 bump 273 (#9751) 2023-08-25 03:05:04 -07:00
Predrag Gruevski eee0d1d0dd Update repository links in the package metadata. (#9454) 2023-08-18 12:55:43 -04:00
Bagatur a69d1b84f4 bump 267 (#9403) 2023-08-17 08:47:13 -07:00
Bagatur 5935767056 bump lc 246, lce 9 (#9207) 2023-08-14 08:14:37 -07:00
Harrison Chase 4d526c49ed bump experimental to 008 (#8490) 2023-07-30 07:28:18 -07:00
Harrison Chase 2448043b84 bump and fix (#8441) 2023-07-28 17:16:51 -07:00
Bagatur 61dd92f821 bump 246 (#8410) 2023-07-28 01:18:37 -07:00
Harrison Chase ae78ef7fe6 bump experimental to 005 (#8339) 2023-07-26 21:46:28 -07:00
Bagatur 5c6dcb1960 bump 243 (#8289) 2023-07-26 05:41:56 -07:00
Bagatur 82b8d8596c bump lc241 exp3 (#8193) 2023-07-24 11:52:44 -07:00
Bagatur 4928f7a9f5 undo bump (#8192) 2023-07-24 11:32:17 -07:00
Bagatur d5689d58ab Bagatur/bump 241 (#8182) 2023-07-24 07:47:40 -07:00
Harrison Chase 77bf75c236 bump experimental to 002 (#8150) 2023-07-23 09:22:39 -07:00
Harrison Chase 9f3073d418 bump versions (#8129) 2023-07-22 08:46:37 -07:00
Harrison Chase aa0e69bc98 Harrison/official pre release (#8106) 2023-07-21 18:44:32 -07:00
Harrison Chase 8dcabd9205 bump releases rc0 (#8097) 2023-07-21 13:54:57 -07:00
Harrison Chase d353d668e4 remove CVEs (#8092)
This PR aims to move all code with CVEs into `langchain.experimental`.
Note that we are NOT yet removing from the core `langchain` package - we
will give people a week to migrate here.

See MIGRATE.md for how to migrate

Zero changes to functionality

Vulnerabilities this addresses:

PALChain:
- https://security.snyk.io/vuln/SNYK-PYTHON-LANGCHAIN-5752409
- https://security.snyk.io/vuln/SNYK-PYTHON-LANGCHAIN-5759265

SQLDatabaseChain
- https://security.snyk.io/vuln/SNYK-PYTHON-LANGCHAIN-5759268

`load_prompt` (Python files only)
- https://security.snyk.io/vuln/SNYK-PYTHON-LANGCHAIN-5725807
2023-07-21 13:32:39 -07:00
Harrison Chase da04760de1 Harrison/move experimental (#8084) 2023-07-21 10:36:28 -07:00