Extracting Receipt Data using GPT-4 and Python¶
This post demonstrates how to use Python's Pydantic library and OpenAI's GPT-4 model to extract receipt data from images and validate the total amount. This method is particularly useful for automating expense tracking and financial analysis tasks.
Defining the Item and Receipt Classes¶
First, we define two Pydantic models, Item
and Receipt
, to structure the extracted data. The Item
class represents individual items on the receipt, with fields for name, price, and quantity. The Receipt
class contains a list of Item
objects and the total amount.
from pydantic import BaseModel
class Item(BaseModel):
name: str
price: float
quantity: int
class Receipt(BaseModel):
items: list[Item]
total: float
Validating the Total Amount¶
To ensure the accuracy of the extracted data, we use Pydantic's model_validator
decorator to define a custom validation function, check_total
. This function calculates the sum of item prices and compares it to the extracted total amount. If there's a discrepancy, it raises a ValueError
.
from pydantic import model_validator
@model_validator(mode="after")
def check_total(self):
items = self.items
total = self.total
calculated_total = sum(item.price * item.quantity for item in items)
if calculated_total != total:
raise ValueError(
f"Total {total} does not match the sum of item prices {calculated_total}"
)
return self
Extracting Receipt Data from Images¶
The extract_receipt
function uses OpenAI's GPT-4 model to process an image URL and extract receipt data. We utilize the instructor
library to configure the OpenAI client for this purpose.
import instructor
from openai import OpenAI
client = instructor.from_openai(OpenAI())
def extract(url: str) -> Receipt:
return client.chat.completions.create(
model="gpt-4",
max_tokens=4000,
response_model=Receipt,
messages=[
{
"role": "user",
"content": [
{
"type": "image_url",
"image_url": {"url": url},
},
{
"type": "text",
"text": "Analyze the image and return the items in the receipt and the total amount.",
},
],
}
],
)
Practical Examples¶
In these examples, we apply the method to extract receipt data from two different images. The custom validation function ensures that the extracted total amount matches the sum of item prices.
url = "https://templates.mediamodifier.com/645124ff36ed2f5227cbf871/supermarket-receipt-template.jpg"
receipt = extract(url)
print(receipt)
"""
items=[Item(name='Lorem ipsum', price=9.2, quantity=1), Item(name='Lorem ipsum dolor sit', price=19.2, quantity=1), Item(name='Lorem ipsum dolor sit amet', price=15.0, quantity=1), Item(name='Lorem ipsum', price=15.0, quantity=1), Item(name='Lorem ipsum', price=15.0, quantity=1), Item(name='Lorem ipsum dolor sit', price=15.0, quantity=1), Item(name='Lorem ipsum', price=19.2, quantity=1)] total=107.6
"""
By combining the power of GPT-4 and Python's Pydantic library, we can accurately extract and validate receipt data from images, streamlining expense tracking and financial analysis tasks.