mirror of
https://github.com/kennethreitz/instructor.git
synced 2026-06-05 22:50:18 +00:00
Examples of using LLMs for citation verification (#192)
+4
-3
@@ -4,9 +4,10 @@ The goal of the blog is to capture some content that does not neatly fit within

 ## Advanced Topics

-- [Query Understanding and Expansion for RAG](posts/rag-and-beyond.md)
-- [GPT-4 Level summarization with GPT3.5 Finetuning](posts/chain-of-density.md)
-- [Deepdive on LLM Guardrails / Validation](posts/validation-part1.md)
+- [Query Understanding for RAG: Beyond Embeddings](posts/rag-and-beyond.md)
+- [Finetuning: GPT-4 level summaries with GPT-3.5-turbo](posts/chain-of-density.md)
+- [Introduction to Guardrails and Validation](posts/validation-part1.md)
+- [Validating Citations](posts/citations.md)
 - [A Guide to Fine-Tuning and Distillation](posts/distilation-part1.md)

 ## Learning Python

@@ -0,0 +1,268 @@
---
draft: False
date: 2023-11-18
slug: validate-citations
tags:
  - pydantic
  - validation
  - finetuning
  - citations
  - hallucination
authors:
  - jxnl
---

# Verifying LLM Citations with Pydantic

Ensuring that generated answers are accurate is crucial. This blog post explores how Pydantic's validators can improve accuracy by verifying that every citation actually appears in the source material.

We'll start with a simple substring check to verify citations. Then we'll use `instructor` itself to have an LLM verify citations and check that answers align with the given citations. Finally, we'll explore how these techniques can be used to generate a dataset of accurate responses.

## Example 1: Simple Substring Check

In this example, we use the `Statements` class to verify that a given substring quote exists within a text chunk. If the substring is not found, an error is raised.

### Code Example:

```python
from typing import List, Optional
from openai import OpenAI
from pydantic import BaseModel, Field, ValidationError, ValidationInfo, field_validator, model_validator
import instructor

client = instructor.patch(OpenAI())

class Statements(BaseModel):
    body: str
    substring_quote: str

    @field_validator("substring_quote")
    @classmethod
    def substring_quote_exists(cls, v: str, info: ValidationInfo):
        # The text chunks are passed in through the validation context
        context = info.context.get("text_chunks", None)

        for text_chunk in context.values():
            if v in text_chunk: # (1)
                return v
        raise ValueError(f"Could not find substring_quote `{v}` in contexts")


class AnswerWithCitaton(BaseModel):
    question: str
    answer: List[Statements]
```

1. While we use a simple substring check in this example, we can use more complex techniques like regex or Levenshtein distance.

Once the class is defined, we can validate an answer against the context. Pydantic raises an error if the quote is not found in any text chunk.

```python
try:
    AnswerWithCitaton.model_validate(
        {
            "question": "What is the capital of France?",
            "answer": [
                {"body": "Paris", "substring_quote": "Paris is the capital of France"},
            ],
        },
        context={
            "text_chunks": {
                1: "Jason is a pirate",
                2: "Paris is not the capital of France",
                3: "Irrelevant data",
            }
        },
    )
except ValidationError as e:
    print(e)
```

### Error Message Example:

```
answer.0.substring_quote
  Value error, Could not find substring_quote `Paris is the capital of France` in contexts [type=value_error, input_value='Paris is the capital of France', input_type=str]
    For further information visit https://errors.pydantic.dev/2.4/v/value_error
```

Pydantic raises a validation error when the value of `substring_quote` is not found in any text chunk in the context. The same pattern extends to looser matching, using techniques like regex or Levenshtein distance.
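
As a rough illustration of what a looser check could look like, here is a minimal sketch that normalizes whitespace and case with a regex and then scores similarity with the standard library's `difflib.SequenceMatcher`. It uses a similarity ratio as a stand-in for Levenshtein distance. The helper names `normalize` and `fuzzy_contains` and the `0.9` threshold are assumptions made for this example; they are not part of `instructor` or Pydantic.

```python
import re
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return re.sub(r"\s+", " ", text).strip().lower()


def fuzzy_contains(quote: str, chunk: str, threshold: float = 0.9) -> bool:
    """Return True if `quote` approximately appears somewhere in `chunk`.

    Slides a window the length of the quote across the chunk and keeps the
    best similarity ratio. The threshold is illustrative, not a tuned value.
    """
    quote, chunk = normalize(quote), normalize(chunk)
    if not quote:
        return False
    if quote in chunk:  # exact match after normalization
        return True

    window = len(quote)
    best = 0.0
    for start in range(max(1, len(chunk) - window + 1)):
        candidate = chunk[start : start + window]
        best = max(best, SequenceMatcher(None, quote, candidate).ratio())
    return best >= threshold
```

Inside the field validator, the `if v in text_chunk` test could then be swapped for `if fuzzy_contains(v, text_chunk)`. Looser matching tolerates typos and formatting differences, but it also makes it easier to accept a quote that is not really in the source, so the threshold deserves care.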

## Example 2: Using LLM for Verification

This approach uses an OpenAI model to validate citations. If the citation does not exist in the context, the LLM returns an error message.

### Code Example:

```python
class Validation(BaseModel):
    is_valid: bool
    error_messages: Optional[str] = Field(None, description="Error messages if any")


class Statements(BaseModel):
    body: str
    substring_quote: str

    @model_validator(mode="after")
    def substring_quote_exists(self, info: ValidationInfo):
        context = info.context.get("text_chunks", None)

        # Ask the LLM whether the quote appears in the context
        resp: Validation = client.chat.completions.create(
            response_model=Validation,
            messages=[
                {
                    "role": "user",
                    "content": f"Does the following citation exist in the following context?\n\nCitation: {self.substring_quote}\n\nContext: {context}",
                }
            ],
            model="gpt-3.5-turbo",
        )

        if resp.is_valid:
            return self

        raise ValueError(resp.error_messages)


class AnswerWithCitaton(BaseModel):
    question: str
    answer: List[Statements]
```

Now when we use a correct citation, the LLM returns a valid response.

```python
resp = AnswerWithCitaton.model_validate(
    {
        "question": "What is the capital of France?",
        "answer": [
            {"body": "Paris", "substring_quote": "Paris is the capital of France"},
        ],
    },
    context={
        "text_chunks": {
            1: "Jason is a pirate",
            2: "Paris is the capital of France",
            3: "Irrelevant data",
        }
    },
)
print(resp.model_dump_json(indent=2))
```

### Result:

```json
{
  "question": "What is the capital of France?",
  "answer": [
    {
      "body": "Paris",
      "substring_quote": "Paris is the capital of France"
    }
  ]
}
```

When we have citations that don't exist in the context, the LLM returns an error message.

```python
try:
    AnswerWithCitaton.model_validate(
        {
            "question": "What is the capital of France?",
            "answer": [
                {"body": "Paris", "substring_quote": "Paris is the capital of France"},
            ],
        },
        context={
            "text_chunks": {
                1: "Jason is a pirate",
                2: "Paris is not the capital of France",
                3: "Irrelevant data",
            }
        },
    )
except ValidationError as e:
    print(e)
```

### Error Message Example:

```
1 validation error for AnswerWithCitaton
answer.0
  Value error, Citation not found in context [type=value_error, input_value={'body': 'Paris', 'substr... the capital of France'}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.4/v/value_error
```

## Example 3: Aligning Citations and Answers

In this example, we check that the provided answers are aligned with the given citations and context. The LLM is used to verify the alignment.

We use the same `Statements` model as above, but we add a new model for the answer that also verifies the alignment of citations.

### Code Example:

```python
class AnswerWithCitaton(BaseModel):
    question: str
    answer: List[Statements]

    @model_validator(mode="after")
    def validate_answer(self, info: ValidationInfo):
        context = info.context.get("text_chunks", None)

        # Ask the LLM whether the answers agree with the question and context
        resp: Validation = client.chat.completions.create(
            response_model=Validation,
            messages=[
                {
                    "role": "user",
                    "content": f"Do the following answers match the question and the context?\n\nQuestion: {self.question}\n\nAnswer: {self.answer}\n\nContext: {context}",
                }
            ],
            model="gpt-3.5-turbo",
        )

        if resp.is_valid:
            return self

        raise ValueError(resp.error_messages)
```

When we have a mismatch between the answer and the citation, the LLM returns an error message.

```python
try:
    AnswerWithCitaton.model_validate(
        {
            "question": "What is the capital of France?",
            "answer": [
                {"body": "Texas", "substring_quote": "Paris is the capital of France"},
            ],
        },
        context={
            "text_chunks": {
                1: "Jason is a pirate",
                2: "Paris is the capital of France",
                3: "Irrelevant data",
            }
        },
    )
except ValidationError as e:
    print(e)
```

### Error Message Example:

```
1 validation error for AnswerWithCitaton
  Value error, The answer does not match the question and context [type=value_error, input_value={'question': 'What is the...he capital of France'}]}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.4/v/value_error
```

## Conclusion

These examples demonstrate the potential of using Pydantic and OpenAI to improve accuracy through citation verification. While the LLM-based approach may be too slow and costly to run on every request, it has exciting implications for generating a dataset of accurate responses. By applying this method during data generation, we can fine-tune a model that excels in citation accuracy. This is similar to our last post on [finetuning a better summarizer](https://jxnl.github.io/instructor/blog/2023/11/05/chain-of-density/).
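
To make the data-generation idea concrete, here is a minimal sketch of the filtering step: run a validator over candidate examples and keep only those that pass. The names `build_verified_dataset` and `substring_check` are hypothetical, and the dictionaries stand in for whatever record format a real pipeline uses. In practice the `is_valid` callable might wrap `AnswerWithCitaton.model_validate(..., context=...)` in a `try`/`except ValidationError`; that wiring is an assumption and is not shown here.

```python
from typing import Callable, Dict, Iterable, List


def build_verified_dataset(
    candidates: Iterable[Dict],
    is_valid: Callable[[Dict], bool],
) -> List[Dict]:
    """Keep only the candidate examples that pass citation validation."""
    return [example for example in candidates if is_valid(example)]


def substring_check(example: Dict) -> bool:
    """Stand-in validator: every quote must appear verbatim in some text chunk."""
    chunks = list(example["text_chunks"].values())
    return all(
        any(stmt["substring_quote"] in chunk for chunk in chunks)
        for stmt in example["answer"]
    )


answer = [{"body": "Paris", "substring_quote": "Paris is the capital of France"}]
candidates = [
    {
        "question": "What is the capital of France?",
        "answer": answer,
        "text_chunks": {1: "Jason is a pirate", 2: "Paris is the capital of France"},
    },
    {
        "question": "What is the capital of France?",
        "answer": answer,
        "text_chunks": {1: "Jason is a pirate", 2: "Irrelevant data"},
    },
]

# Only the first candidate has a quote that is backed by its text chunks
dataset = build_verified_dataset(candidates, substring_check)
```

The examples that survive can then serve as fine-tuning data, so the validation cost is paid once during generation rather than on every request.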

If you like the content, check out our [GitHub](https://github.com/jxnl/instructor), give us a star, and check out the library.
@@ -0,0 +1,225 @@
from typing import List, Optional
from openai import OpenAI
from pydantic import (
    BaseModel,
    Field,
    ValidationError,
    ValidationInfo,
    field_validator,
    model_validator,
)

import instructor

client = instructor.patch(OpenAI())

"""
Example 1) Simple Substring check that compares a citation to a text chunk
"""


class Statements(BaseModel):
    body: str
    substring_quote: str

    @field_validator("substring_quote")
    @classmethod
    def substring_quote_exists(cls, v: str, info: ValidationInfo):
        context = info.context.get("text_chunks", None)

        # Check if the substring_quote is in the text_chunk
        # if not, raise an error
        for text_chunk in context.values():
            if v in text_chunk:
                return v
        raise ValueError(
            f"Could not find substring_quote `{v}` in contexts",
        )


class AnswerWithCitaton(BaseModel):
    question: str
    answer: List[Statements]


try:
    AnswerWithCitaton.model_validate(
        {
            "question": "What is the capital of France?",
            "answer": [
                {"body": "Paris", "substring_quote": "Paris is the capital of France"},
            ],
        },
        context={
            "text_chunks": {
                1: "Jason is a pirate",
                2: "Paris is not the capital of France",
                3: "Irrelevant data",
            }
        },
    )
except ValidationError as e:
    print(e)
"""
answer.0.substring_quote
  Value error, Could not find substring_quote `Paris is the capital of France` in contexts [type=value_error, input_value='Paris is the capital of France', input_type=str]
    For further information visit https://errors.pydantic.dev/2.4/v/value_error
"""


"""
Example 2) Using an LLM to verify if a citation exists in the context
"""


class Validation(BaseModel):
    """
    Verification response from the LLM,
    the error message should be detailed if the is_valid is False
    but keep it to less than 100 characters, reference specific
    attributes that you are comparing, use `...` if the string is too long
    """

    is_valid: bool
    error_messages: Optional[str] = Field(None, description="Error messages if any")


class Statements(BaseModel):
    body: str
    substring_quote: str

    @model_validator(mode="after")
    def substring_quote_exists(self, info: ValidationInfo):
        context = info.context.get("text_chunks", None)

        resp: Validation = client.chat.completions.create(
            response_model=Validation,
            messages=[
                {
                    "role": "user",
                    "content": f"Does the following citation exist in the following context?\n\nCitation: {self.substring_quote}\n\nContext: {context}",
                }
            ],
            model="gpt-3.5-turbo",
        )

        if resp.is_valid:
            return self

        raise ValueError(resp.error_messages)


class AnswerWithCitaton(BaseModel):
    question: str
    answer: List[Statements]


resp = AnswerWithCitaton.model_validate(
    {
        "question": "What is the capital of France?",
        "answer": [
            {"body": "Paris", "substring_quote": "Paris is the capital of France"},
        ],
    },
    context={
        "text_chunks": {
            1: "Jason is a pirate",
            2: "Paris is the capital of France",
            3: "Irrelevant data",
        }
    },
)
# output: notice that there are no errors
print(resp.model_dump_json(indent=2))
{
    "question": "What is the capital of France?",
    "answer": [{"body": "Paris", "substring_quote": "Paris is the capital of France"}],
}

# Now we change the text chunk to something else, and we get an error
try:
    AnswerWithCitaton.model_validate(
        {
            "question": "What is the capital of France?",
            "answer": [
                {"body": "Paris", "substring_quote": "Paris is the capital of France"},
            ],
        },
        context={
            "text_chunks": {
                1: "Jason is a pirate",
                2: "Paris is not the capital of France",
                3: "Irrelevant data",
            }
        },
    )
except ValidationError as e:
    print(e)
"""
1 validation error for AnswerWithCitaton
answer.0
  Value error, Citation not found in context [type=value_error, input_value={'body': 'Paris', 'substr... the capital of France'}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.4/v/value_error
"""

# Example 3) Using an LLM to verify if the citations and the answers are all aligned


# we keep the same model as above for Statements, but we add a new model for the answer
# that also verifies that the citations are aligned with the answers
class AnswerWithCitaton(BaseModel):
    question: str
    answer: List[Statements]

    @model_validator(mode="after")
    def validate_answer(self, info: ValidationInfo):
        context = info.context.get("text_chunks", None)

        resp: Validation = client.chat.completions.create(
            response_model=Validation,
            messages=[
                {
                    "role": "user",
                    "content": f"Do the following answers match the question and the context?\n\nQuestion: {self.question}\n\nAnswer: {self.answer}\n\nContext: {context}",
                }
            ],
            model="gpt-3.5-turbo",
        )

        if resp.is_valid:
            return self

        raise ValueError(resp.error_messages)


"""
Using LLMs for citation verification is inefficient during runtime.
However, we can utilize them to create a dataset consisting only of accurate responses
where citations must be valid (as determined by LLM, fuzzy text search, etc.).

This approach would require an initial investment during data generation to obtain
a fine-tuned model for improved citation.
"""
try:
    AnswerWithCitaton.model_validate(
        {
            "question": "What is the capital of France?",
            "answer": [
                {"body": "Texas", "substring_quote": "Paris is the capital of France"},
            ],
        },
        context={
            "text_chunks": {
                1: "Jason is a pirate",
                2: "Paris is the capital of France",
                3: "Irrelevant data",
            }
        },
    )
except ValidationError as e:
    print(e)
"""
1 validation error for AnswerWithCitaton
  Value error, The answer does not match the question and context [type=value_error, input_value={'question': 'What is the...he capital of France'}]}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.4/v/value_error
"""