improve documentation

Jason
2023-09-13 13:39:09 -04:00
parent e025eaa64f
commit 61cff08909
17 changed files with 453 additions and 145 deletions
+137
@@ -0,0 +1,137 @@
# Entity Resolution and Visualization for Legal Documents
In this guide, we demonstrate how to extract and resolve entities from a sample legal contract, then visualize those entities and their dependencies as an entity graph. This approach is useful for legal tech applications, where it helps readers follow how the terms of a complex document relate to one another.
!!! tips "Motivation"
    Legal contracts are full of intricate details and interconnected clauses. Automatically extracting and visualizing these elements can make it easier to understand the document's overall structure and terms.
## Defining the Data Structures
The **`Entity`** and **`Property`** classes model extracted entities and their attributes. **`DocumentExtraction`** encapsulates a list of these entities.
```python
from typing import List

from pydantic import BaseModel, Field


class Property(BaseModel):
    key: str
    value: str
    resolved_absolute_value: str


class Entity(BaseModel):
    id: int = Field(
        ...,
        description="Unique identifier for the entity, used for deduplication, design a scheme that allows multiple entities",
    )
    subquote_string: List[str] = Field(
        ...,
        description="Correctly resolved value of the entity, if the entity is a reference to another entity, this should be the id of the referenced entity, include a few more words before and after the value to allow for some context to be used in the resolution",
    )
    entity_title: str
    properties: List[Property] = Field(
        ..., description="List of properties of the entity"
    )
    dependencies: List[int] = Field(
        ...,
        description="List of entity ids that this entity depends or relies on to resolve it",
    )


class DocumentExtraction(BaseModel):
    entities: List[Entity] = Field(
        ...,
        description="List of entities extracted from the document, each entity should be its own separate object",
    )
```
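To make the shape of these models concrete, here is a minimal sketch that builds a `DocumentExtraction` by hand from a plain dictionary. The payload values are written by hand for illustration (they are not model output), and the models are repeated in compact form, without the field descriptions, so the block runs on its own.

```python
from typing import List

from pydantic import BaseModel, ValidationError


class Property(BaseModel):
    key: str
    value: str
    resolved_absolute_value: str


class Entity(BaseModel):
    id: int
    subquote_string: List[str]
    entity_title: str
    properties: List[Property]
    dependencies: List[int]


class DocumentExtraction(BaseModel):
    entities: List[Entity]


# Hand-written payload (illustrative only): entity 2 depends on entity 1.
payload = {
    "entities": [
        {
            "id": 1,
            "subquote_string": ["entered into on 2020-01-01 by and between"],
            "entity_title": "Agreement Date",
            "properties": [
                {
                    "key": "date",
                    "value": "2020-01-01",
                    "resolved_absolute_value": "2020-01-01",
                }
            ],
            "dependencies": [],
        },
        {
            "id": 2,
            "subquote_string": ["30 days after the agreement date"],
            "entity_title": "Delivery Date",
            "properties": [
                {
                    "key": "date",
                    "value": "30 days after the agreement date",
                    "resolved_absolute_value": "2020-01-31",
                }
            ],
            "dependencies": [1],
        },
    ]
}

extraction = DocumentExtraction(**payload)
delivery = extraction.entities[1]

# An entity with missing required fields is rejected by validation.
try:
    Entity(id=3, entity_title="Incomplete")
    rejected = False
except ValidationError:
    rejected = True
```

Note how `value` keeps the wording from the document while `resolved_absolute_value` holds the concrete result, and how `dependencies` records which entity was needed to compute it.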
## Entity Extraction and Resolution
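The **`ask_ai`** function calls OpenAI's chat completion API to extract and resolve entities from the input content. `instructor.patch()` adds the `response_model` argument, so the call returns a validated `DocumentExtraction` instead of a raw completion.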
```python
import instructor
import openai

# Patch the OpenAI client so that ChatCompletion.create accepts `response_model`.
instructor.patch()


def ask_ai(content: str) -> DocumentExtraction:
    return openai.ChatCompletion.create(
        model="gpt-4",
        response_model=DocumentExtraction,
        messages=[
            {
                "role": "system",
                "content": "Extract and resolve a list of entities from the following document:",
            },
            {
                "role": "user",
                "content": content,
            },
        ],
    )  # type: ignore
```
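"Resolving" an entity here means turning a relative reference into an absolute value. For the dates in the sample contract used later in this guide, the arithmetic the model is expected to perform can be sketched with the standard library. The names below are hypothetical, the sketch assumes the "signed date" is the agreement date, and actual model output may differ.

```python
from datetime import date, timedelta

# Anchor date stated explicitly in the sample contract.
agreement_date = date(2020, 1, 1)

# Relative terms from the contract, as day offsets from the agreement date.
# (Hypothetical names; assumes the "signed date" is the agreement date.)
relative_terms = {
    "delivery_date": 30,        # "30 days after the agreement date"
    "initial_payment_due": 7,   # "within 7 days of the signed date"
    "final_payment_due": 45,    # "45 days after [SignDate]"
}

# Resolve each relative term to an absolute ISO date string.
resolved = {
    name: (agreement_date + timedelta(days=offset)).isoformat()
    for name, offset in relative_terms.items()
}

for name, value in resolved.items():
    print(f"{name}: {value}")
```

Each resolved date depends on the agreement date, which is the kind of relationship the `dependencies` field is meant to capture.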
## Graph Visualization
**`generate_graph`** takes the extracted entities and visualizes them using Graphviz. It creates nodes for each entity and edges for their dependencies.
```python
from graphviz import Digraph


def generate_html_label(entity: Entity) -> str:
    rows = [f"<tr><td>{prop.key}</td><td>{prop.resolved_absolute_value}</td></tr>" for prop in entity.properties]
    table_rows = "".join(rows)
    # Graphviz HTML-like labels must be wrapped in an outer <...> pair.
    return f"<<table border='0' cellborder='1' cellspacing='0'><tr><td colspan='2'><b>{entity.entity_title}</b></td></tr>{table_rows}</table>>"


def generate_graph(data: DocumentExtraction):
    dot = Digraph(comment="Entity Graph", node_attr={"shape": "plaintext"})

    # One node per entity
    for entity in data.entities:
        label = generate_html_label(entity)
        dot.node(str(entity.id), label)

    # One edge per dependency
    for entity in data.entities:
        for dep_id in entity.dependencies:
            dot.edge(str(entity.id), str(dep_id))

    dot.render("entity.gv", view=True)
```
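Graphviz parses HTML-like labels as markup, so a property value containing `&`, `<`, or `>` can break the label. One way to guard against this, sketched here with the standard library's `html.escape`, is to escape every interpolated string before building a row. The helper name `html_row` is hypothetical and not part of the code above.

```python
from html import escape


def html_row(key: str, value: str) -> str:
    """Build one table row with both cells escaped for an HTML-like label."""
    return f"<tr><td>{escape(key)}</td><td>{escape(value)}</td></tr>"


# Hand-written example value containing characters that would break the markup.
row = html_row("parties", "Company A & Company B <the Client>")
print(row)
```

The same escaping would apply to `entity_title` if titles can contain these characters.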
## Execution
Finally, execute the code to visualize the entity graph for the sample legal contract.
```python
content = """
Sample Legal Contract
Agreement Contract
This Agreement is made and entered into on 2020-01-01 by and between Company A ("the Client") and Company B ("the Service Provider").
Article 1: Scope of Work
The Service Provider will deliver the software product to the Client 30 days after the agreement date.
Article 2: Payment Terms
The total payment for the service is $50,000.
An initial payment of $10,000 will be made within 7 days of the signed date.
The final payment will be due 45 days after [SignDate].
Article 3: Confidentiality
The parties agree not to disclose any confidential information received from the other party for 3 months after the final payment date.
Article 4: Termination
The contract can be terminated with a 30-day notice, unless there are outstanding obligations that must be fulfilled after the [DeliveryDate].
""" # Your legal contract here
model = ask_ai(content)
generate_graph(model)
```
This writes the Graphviz source to "entity.gv", renders it to "entity.gv.pdf" (PDF is the default output format), and opens the resulting graph of entities and their dependencies.
![Entity Graph](entity_resolution.png)
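The `dependencies` lists also define an order in which entities can be resolved: an entity can only be resolved once everything it depends on has been. A minimal sketch using the standard library's `graphlib` (Python 3.9+) follows. The ids and their meanings are hand-written for illustration, not model output.

```python
from graphlib import CycleError, TopologicalSorter

# Entity id -> ids it depends on (hand-written, mirrors Entity.dependencies).
dependencies = {
    1: [],   # agreement date
    2: [1],  # delivery date
    3: [1],  # final payment date
    4: [3],  # end of the confidentiality period
}

try:
    # static_order() yields each id only after all of its dependencies.
    resolution_order = list(TopologicalSorter(dependencies).static_order())
except CycleError as err:
    # Circular references cannot be resolved; args[1] holds the detected cycle.
    raise SystemExit(f"Circular dependency: {err.args[1]}")

print(resolution_order)
```

With real output, the mapping could be built as `{e.id: e.dependencies for e in model.entities}`.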
Binary file not shown.


+6 -1
@@ -6,13 +6,18 @@
- [Self-Assessment via Validators](self_critique.md): Implement AI self-assessment with `llm_validator`.
- [Citations via Regex](exact_citations.md): Retrieve exact citations using regular expressions and smart prompting.
- [Extracting Search Queries](search.md): Segment search queries through function calling and multi-task definitions.
- [Generating Knowledge Graphs](knowledge_graph.md): Generate knowledge graphs from a question.
- [Query Decomposition](planning-tasks.md): Decompose complex queries into subqueries in a single request.
- [Entity Extraction and Resolution](entity_resolution.md): Extract and resolve entities from a document.
- [Working with Recursive Schemas](recursive.md): Implement and understand recursive schemas.
- [Table Extraction from Text](autodataframe.md): Extract tables, potentially multiple, automatically from textual data.
+87
@@ -0,0 +1,87 @@
# Visualizing Knowledge Graphs for Complex Topics
In this guide, you'll learn how to generate and visualize a detailed knowledge graph for a complex topic, in this case quantum mechanics. We use OpenAI's API and the Graphviz library to bring structure to intricate subjects.
!!! tips "Motivation"
    Knowledge graphs offer a visually appealing and coherent way to understand complicated topics like quantum mechanics. By generating these graphs automatically, you can accelerate the learning process and make it easier to digest complex information.
## Defining the Structures
Let's model a knowledge graph with **`Node`** and **`Edge`** objects. **`Node`** objects represent key concepts or entities, while **`Edge`** objects indicate the relationships between them.
```python
from typing import List

from pydantic import BaseModel, Field


class Node(BaseModel):
    id: int
    label: str
    color: str


class Edge(BaseModel):
    source: int
    target: int
    label: str
    color: str = "black"


class KnowledgeGraph(BaseModel):
    # default_factory already makes the field optional; combining it with
    # a `...` default is rejected by some pydantic versions.
    nodes: List[Node] = Field(default_factory=list)
    edges: List[Edge] = Field(default_factory=list)
```
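As a quick check of how these models behave, the sketch below builds a two-node graph by hand. It assumes the list fields are declared with `Field(default_factory=list)`, so that an empty `KnowledgeGraph()` is valid. The node labels and colors are made up for illustration.

```python
from typing import List

from pydantic import BaseModel, Field


class Node(BaseModel):
    id: int
    label: str
    color: str


class Edge(BaseModel):
    source: int
    target: int
    label: str
    color: str = "black"


class KnowledgeGraph(BaseModel):
    nodes: List[Node] = Field(default_factory=list)
    edges: List[Edge] = Field(default_factory=list)


# Both lists default to empty, so no arguments are required.
empty = KnowledgeGraph()

# Hand-written graph: the edge omits `color` and falls back to "black".
kg = KnowledgeGraph(
    nodes=[
        Node(id=1, label="Wave function", color="lightblue"),
        Node(id=2, label="Measurement", color="orange"),
    ],
    edges=[Edge(source=1, target=2, label="collapses under")],
)
```

Edges refer to nodes by `id`, so `source` and `target` should match ids that appear in `nodes`.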
## Generating Knowledge Graphs
The **`generate_graph`** function leverages OpenAI's API to generate a knowledge graph based on the input query.
```python
import instructor
import openai

# Patch the OpenAI client so that ChatCompletion.create accepts `response_model`.
instructor.patch()


def generate_graph(input) -> KnowledgeGraph:
    return openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {
                "role": "user",
                "content": f"Help me understand the following by describing it as a detailed knowledge graph: {input}",
            }
        ],
        response_model=KnowledgeGraph,
    )  # type: ignore
```
## Visualizing the Graph
The **`visualize_knowledge_graph`** function uses the Graphviz library to render the generated knowledge graph.
```python
from graphviz import Digraph


def visualize_knowledge_graph(kg: KnowledgeGraph):
    dot = Digraph(comment="Knowledge Graph")

    # Add nodes
    for node in kg.nodes:
        dot.node(str(node.id), node.label, color=node.color)

    # Add edges
    for edge in kg.edges:
        dot.edge(str(edge.source), str(edge.target), label=edge.label, color=edge.color)

    # Render the graph
    dot.render("knowledge_graph.gv", view=True)
```
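Graphviz implicitly creates a node for any id that appears only in an edge, so an edge that references a missing node still renders, but as a bare node showing the raw id instead of a label. A small check before rendering can surface this. The function name `find_dangling_edges` is hypothetical, and it works on plain ids so that the block runs without the models.

```python
from typing import Iterable, List, Set, Tuple


def find_dangling_edges(
    node_ids: Set[int], edges: Iterable[Tuple[int, int]]
) -> List[Tuple[int, int]]:
    """Return the (source, target) pairs that reference an id not in node_ids."""
    return [
        (source, target)
        for source, target in edges
        if source not in node_ids or target not in node_ids
    ]


# Hand-written example: node 3 is referenced by an edge but never defined.
dangling = find_dangling_edges({1, 2}, [(1, 2), (2, 3)])
print(dangling)
```

For a `KnowledgeGraph` named `kg`, the call would look like `find_dangling_edges({n.id for n in kg.nodes}, [(e.source, e.target) for e in kg.edges])`.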
## Putting It All Together
Execute the code to generate and visualize a knowledge graph for understanding quantum mechanics.
```python
graph: KnowledgeGraph = generate_graph("Teach me about quantum mechanics")
visualize_knowledge_graph(graph)
```
![Knowledge Graph](knowledge_graph.png)
This writes the Graphviz source to "knowledge_graph.gv" and renders it to "knowledge_graph.gv.pdf" (PDF is the default output format), which opens automatically because `view=True`. Open the rendered file to explore the key concepts and their relationships in quantum mechanics.
By leveraging automated knowledge graphs, you can dissect complex topics into digestible pieces, making the learning journey less daunting and more effective.
Binary file not shown.
