Your First Extraction¶
This guide walks you through creating your first structured extraction with Instructor, focusing on the basics with a simple example.
Basic Example¶
Let's extract a person's name and age from text:
from pydantic import BaseModel
import instructor
from openai import OpenAI
# 1. Define your data model
class Person(BaseModel):
name: str
age: int
# 2. Set up the Instructor client
client = instructor.from_openai(OpenAI())
# 3. Extract structured data
person = client.chat.completions.create(
model="gpt-3.5-turbo",
response_model=Person,
messages=[
{"role": "user", "content": "John Doe is 30 years old"}
]
)
# 4. Use the structured data
print(f"Name: {person.name}, Age: {person.age}")
# Output: Name: John Doe, Age: 30
How It Works¶
┌─────────────┐ ┌──────────────┐ ┌─────────────┐
│ Define │ -> │ Instruct LLM │ -> │ Get Typed │
│ Structure │ │ to Extract │ │ Response │
└─────────────┘ └──────────────┘ └─────────────┘
Let's break down each part:
1. Define Your Data Model¶
This tells Instructor what to extract: - name
: A text string - age
: A whole number
2. Set Up the Client¶
This wraps the OpenAI client with Instructor's functionality.
3. Extract Structured Data¶
person = client.chat.completions.create(
model="gpt-3.5-turbo",
response_model=Person,
messages=[
{"role": "user", "content": "John Doe is 30 years old"}
]
)
The key parts are: - model
: The LLM to use - response_model
: Your data structure definition - messages
: The text to extract from
4. Use the Structured Data¶
The result is a Python object with type-safe properties.
Adding Field Descriptions¶
You can add descriptions to help guide the extraction:
from pydantic import BaseModel, Field
class Person(BaseModel):
name: str = Field(description="Person's full name")
age: int = Field(description="Person's age in years")
These descriptions help the model understand what to extract.
Handling Missing Information¶
If information might be missing, make fields optional:
from typing import Optional
class Person(BaseModel):
name: str
age: Optional[int] = None # Now age is optional
Next Steps¶
Now that you've created your first extraction, you can: 1. Learn about more complex Response Models 2. Set up different Client Configurations 3. Explore Simple Object Extraction