Streaming Lists

This guide explains how to stream lists of structured data with Instructor. Streaming lists allows you to process collection items as they're generated, improving responsiveness for larger outputs.

Basic List Streaming

Here's how to stream a list of structured objects:

from typing import Iterable
import instructor
from openai import OpenAI
from pydantic import BaseModel, Field

# Initialize the client
client = instructor.from_openai(OpenAI())

class Book(BaseModel):
    title: str = Field(..., description="Book title")
    author: str = Field(..., description="Book author")
    year: int = Field(..., description="Publication year")

# Stream a list of books
for book in client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "user", "content": "List 5 classic science fiction books"}
    ],
    response_model=Iterable[Book],  # Iterable yields each item as soon as it is complete
    stream=True
):
    print(f"Received: {book.title} by {book.author} ({book.year})")

This example shows how to:

1. Define a Pydantic model for each list item
2. Use Iterable[Book] as the response model so items can be streamed individually
3. Process each item as it arrives in the stream
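If you also need the complete collection once streaming finishes, you can accumulate items while reacting to them. The sketch below reuses the Book model and client from the example above; the printed messages are only illustrative.

books = []
for book in client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "user", "content": "List 5 classic science fiction books"}
    ],
    response_model=Iterable[Book],
    stream=True
):
    # React to each item immediately...
    print(f"Received: {book.title}")
    # ...and keep it for use after the stream completes
    books.append(book)

print(f"Collected {len(books)} books in total")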

Real-world Example: Task Generation

Here's a practical example of streaming a list of tasks with progress tracking:

from typing import Iterable
import instructor
from openai import OpenAI
from pydantic import BaseModel, Field
import time

client = instructor.from_openai(OpenAI())

class Task(BaseModel):
    title: str = Field(..., description="Task title")
    description: str = Field(..., description="Detailed task description")
    priority: str = Field(..., description="Task priority (High/Medium/Low)")
    estimated_hours: float = Field(..., description="Estimated hours to complete")

print("Generating project tasks...")
start_time = time.time()
received_tasks = 0

for task in client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "user", "content": "Generate a list of 5 tasks for building a personal website"}
    ],
    response_model=Iterable[Task],
    stream=True
):
    received_tasks += 1
    print(f"\nTask {received_tasks}: {task.title} (Priority: {task.priority})")
    print(f"Description: {task.description[:100]}...")
    print(f"Estimated time: {task.estimated_hours} hours")

    # Progress assumes the 5 tasks requested in the prompt are all returned
    progress = (received_tasks / 5) * 100
    print(f"Progress: {progress:.0f}%")

elapsed_time = time.time() - start_time
print(f"\nAll {received_tasks} tasks generated in {elapsed_time:.2f} seconds")

Next Steps