Subscribe to our Newsletter for Updates and Tips¶

2024/09/07
in Pydantic
3 min read

Pydantic is Still All You Need: Reflections on a Year of Structured Outputs

A year ago, I gave a talk titled "Pydantic: All You Need" that kickstarted my Twitter career. Today, I'm back to reaffirm that message and share what I've learned in the past year about using structured outputs with language models.

Watch the youtube video

2024/09/03
in LLM Techniques
2 min read

Structured Outputs for Gemini now supported

We're excited to announce that instructor now supports structured outputs using tool calling for both the Gemini SDK and the VertexAI SDK.

A special shoutout to Sonal for his contributions to the Gemini Tool Calling support.

Let's walk through a simple example of how to use these new features

Installation

To get started, install the latest version of instructor. Depending on whether you're using Gemini or VertexAI, you should install the following:

GeminiVertexAI

pip install "instructor[google-generativeai]"

pip install "instructor[vertexai]"

This ensures that you have the necessary dependencies to use the Gemini or VertexAI SDKs with instructor.

We recommend using the Gemini SDK over the VertexAI SDK for two main reasons.

Compared to the VertexAI SDK, the Gemini SDK comes with a free daily quota of 1.5 billion tokens to use for developers.
The Gemini SDK is significantly easier to setup, all you need is a GOOGLE_API_KEY that you can generate in your GCP console. THe VertexAI SDK on the other hand requires a credentials.json file or an OAuth integration to use.

Getting Started

With our provider agnostic API, you can use the same interface to interact with both SDKs, the only thing that changes here is how we initialise the client itself.

Before running the following code, you'll need to make sure that you have your Gemini API Key set in your shell under the alias GOOGLE_API_KEY.

import instructor
import google.generativeai as genai
from pydantic import BaseModel


class User(BaseModel):
    name: str
    age: int


client = instructor.from_gemini(
    client=genai.GenerativeModel(
        model_name="models/gemini-1.5-flash-latest",  # (1)!
    )
)

resp = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "Extract Jason is 25 years old.",
        }
    ],
    response_model=User,
)

print(resp)
#> name='Jason' age=25

Current Gemini models that support tool calling are gemini-1.5-flash-latest and gemini-1.5-pro-latest.

We can achieve a similar thing with the VertexAI SDK. For this to work, you'll need to authenticate to VertexAI.

There are some instructions here but the easiest way I found was to simply download the GCloud cli and run gcloud auth application-default login.

import instructor
import vertexai  # type: ignore
from vertexai.generative_models import GenerativeModel  # type: ignore
from pydantic import BaseModel

vertexai.init()


class User(BaseModel):
    name: str
    age: int


client = instructor.from_vertexai(
    client=GenerativeModel("gemini-1.5-pro-preview-0409"),  # (1)!
)


resp = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "Extract Jason is 25 years old.",
        }
    ],
    response_model=User,
)

print(resp)
#> name='Jason' age=25

Current Gemini models that support tool calling are gemini-1.5-flash-latest and gemini-1.5-pro-latest.

2024/08/20
in OpenAI
7 min read

Should I Be Using Structured Outputs?

OpenAI recently announced Structured Outputs which ensures that generated responses match any arbitrary provided JSON Schema. In their announcement article, they acknowledged that it had been inspired by libraries such as instructor.

Main Challenges

If you're building complex LLM workflows, you've likely considered OpenAI's Structured Outputs as a potential replacement for instructor.

But before you do so, three key challenges remain:

Limited Validation And Retry Logic: Structured Outputs ensure adherence to the schema but not useful content. You might get perfectly formatted yet unhelpful responses
Streaming Challenges: Parsing raw JSON objects from streamed responses with the sdk is error-prone and inefficient
Unpredictable Latency Issues : Structured Outputs suffers from random latency spikes that might result in an almost 20x increase in response time

Additionally, adopting Structured Outputs locks you into OpenAI's ecosystem, limiting your ability to experiment with diverse models or providers that might better suit specific use-cases.

This vendor lock-in increases vulnerability to provider outages, potentially causing application downtime and SLA violations, which can damage user trust and impact your business reputation.

In this article, we'll show how instructor addresses many of these challenges with features such as automatic reasking when validation fails, automatic support for validated streaming data and more.

2024/07/17
in LLM Observability
4 min read

Parea for Observing, Testing & Fine-tuning of Instructor

Parea is a platform that enables teams to monitor, collaborate, test & label for LLM applications. In this blog we will explore how Parea can be used to enhance the OpenAI client alongside instructor and debug + improve instructor calls. Parea has some features which makes it particularly useful for instructor:

it automatically groups any LLM calls due to reties under a single trace
it automatically tracks any validation error counts & fields that occur when using instructor
it provides a UI to label JSON responses by filling out a form instead of editing JSON objects

Configure Parea

Before starting this tutorial, make sure that you've registered for a Parea account. You'll also need to create an API key.

Example: Writing Emails with URLs from Instructor Docs

We will demonstrate Parea by using instructor to write emails which only contain URLs from the instructor docs. We'll need to install our dependencies before proceeding so simply run the command below.

2024/07/11
in Data Processing
5 min read

Analyzing Youtube Transcripts with Instructor

Extracting Chapter Information

Code Snippets

As always, the code is readily available in our examples/youtube folder in our repo for your reference in the run.py file.

In this post, we'll show you how to summarise Youtube video transcripts into distinct chapters using instructor before exploring some ways you can adapt the code to different applications.

By the end of this article, you'll be able to build an application as per the video below.

2024/06/15
in LLM Techniques
3 min read

Why Instructor is the best way to get JSON from LLMs

Large Language Models (LLMs) like GPT are incredibly powerful, but getting them to return well-formatted JSON can be challenging. This is where the Instructor library shines. Instructor allows you to easily map LLM outputs to JSON data using Python type annotations and Pydantic models.

Instructor makes it easy to get structured data like JSON from LLMs like GPT-3.5, GPT-4, GPT-4-Vision, and open-source models including Mistral/Mixtral, Ollama, and llama-cpp-python.

It stands out for its simplicity, transparency, and user-centric design, built on top of Pydantic. Instructor helps you manage validation context, retries with Tenacity, and streaming Lists and Partial responses.

Instructor provides support for a wide range of programming languages, including:
Python
TypeScript
Ruby
Go
Elixir

2024/06/06
in LLM Techniques
3 min read

Enhancing RAG with Time Filters Using Instructor

Retrieval-augmented generation (RAG) systems often need to handle queries with time-based constraints, like "What new features were released last quarter?" or "Show me support tickets from the past week." Effective time filtering is crucial for providing accurate, relevant responses.

Instructor is a Python library that simplifies integrating large language models (LLMs) with data sources and APIs. It allows defining structured output models using Pydantic, which can be used as prompts or to parse LLM outputs.

2024/05/03
in LLM Observability
7 min read

Why Logfire is a perfect fit for FastAPI + Instructor

Logfire is a new tool that provides key insight into your application with Open Telemetry. Instead of using ad-hoc print statements, Logfire helps to profile every part of your application and is integrated directly into Pydantic and FastAPI, two popular libraries amongst Instructor users.

In short, this is the secret sauce to help you get your application to the finish line and beyond. We'll show you how to easily integrate Logfire into FastAPI, one of the most popular choices amongst users of Instructor using two examples

Data Extraction from a single User Query
Using asyncio to process multiple users in parallel
Streaming multiple objects using an Iterable so that they're available on demand

2024/05/01
in LLM Observability
6 min read

Logfire

Introduction

Logfire is a new observability platform coming from the creators of Pydantic. It integrates almost seamlessly with many of your favourite libraries such as Pydantic, HTTPx and Instructor. In this article, we'll show you how to use Logfire with Instructor to gain visibility into the performance of your entire application.

We'll walk through the following examples

Classifying scam emails using Instructor
Performing simple validation using the llm_validator
Extracting data into a markdown table from an infographic with GPT4V

2024/04/20
in Tutorial
5 min read

Unified Provider Interface with String-Based Initialization

Instructor now offers a simplified way to initialize any supported LLM provider with a single consistent interface. This approach makes it easier than ever to switch between different LLM providers while maintaining the same structured output functionality you rely on.

The Problem

As the number of LLM providers grows, so does the complexity of initializing and working with different client libraries. Each provider has its own initialization patterns, API structures, and quirks. This leads to code that isn't portable between providers and requires significant refactoring when you want to try a new model.

The Solution: String-Based Initialization

We've introduced a new unified interface that allows you to initialize any supported provider with a simple string format:

import instructor
from pydantic import BaseModel

class UserInfo(BaseModel):
    name: str
    age: int

# Initialize any provider with a single consistent interface
client = instructor.from_provider("openai/gpt-4")
client = instructor.from_provider("anthropic/claude-3-sonnet")
client = instructor.from_provider("google/gemini-pro")
client = instructor.from_provider("mistral/mistral-large")

The from_provider function takes a string in the format "provider/model-name" and handles all the details of setting up the appropriate client with the right model. This provides several key benefits:

Simplified Initialization: No need to manually create provider-specific clients
Consistent Interface: Same syntax works across all providers
Reduced Dependency Exposure: You don't need to import specific provider libraries in your application code
Easy Experimentation: Switch between providers with a single line change

Supported Providers

The string-based initialization currently supports all major providers in the ecosystem:

OpenAI: "openai/gpt-4", "openai/gpt-4o", "openai/gpt-3.5-turbo"
Anthropic: "anthropic/claude-3-opus-20240229", "anthropic/claude-3-sonnet-20240229", "anthropic/claude-3-haiku-20240307"
Google Gemini: "google/gemini-pro", "google/gemini-pro-vision"
Mistral: "mistral/mistral-small-latest", "mistral/mistral-medium-latest", "mistral/mistral-large-latest"
Cohere: "cohere/command", "cohere/command-r", "cohere/command-light"
Perplexity: "perplexity/sonar-small-online", "perplexity/sonar-medium-online"
Groq: "groq/llama2-70b-4096", "groq/mixtral-8x7b-32768", "groq/gemma-7b-it"
Writer: "writer/palmyra-instruct", "writer/palmyra-instruct-v2"
AWS Bedrock: "bedrock/anthropic.claude-v2", "bedrock/amazon.titan-text-express-v1"
Cerebras: "cerebras/cerebras-gpt", "cerebras/cerebras-gpt-2.7b"
Fireworks: "fireworks/llama-v2-70b", "fireworks/firellama-13b"
Vertex AI: "vertexai/gemini-pro", "vertexai/text-bison"
Google GenAI: "genai/gemini-pro", "genai/gemini-pro-vision"

Each provider will be initialized with sensible defaults, but you can also pass additional keyword arguments to customize the configuration. For model-specific details, consult each provider's documentation.

Async Support

The unified interface fully supports both synchronous and asynchronous clients:

# Synchronous client (default)
client = instructor.from_provider("openai/gpt-4")

# Asynchronous client
async_client = instructor.from_provider("anthropic/claude-3-sonnet", async_client=True)

# Use like any other async client
response = await async_client.chat.completions.create(
    response_model=UserInfo,
    messages=[{"role": "user", "content": "Extract information about John who is 30 years old"}]
)

Mode Selection

You can also specify which structured output mode to use with the provider:

import instructor
from instructor import Mode

# Override the default mode for a provider
client = instructor.from_provider(
    "anthropic/claude-3-sonnet",
    mode=Mode.ANTHROPIC_TOOLS
)

# Use JSON mode instead of the default tools mode
client = instructor.from_provider(
    "mistral/mistral-large",
    mode=Mode.MISTRAL_STRUCTURED_OUTPUTS
)

# Use reasoning tools instead of regular tools for Anthropic
client = instructor.from_provider(
    "anthropic/claude-3-opus",
    mode=Mode.ANTHROPIC_REASONING_TOOLS
)

If not specified, each provider will use its recommended default mode:

OpenAI: Mode.OPENAI_FUNCTIONS
Anthropic: Mode.ANTHROPIC_TOOLS
Google Gemini: Mode.GEMINI_JSON
Mistral: Mode.MISTRAL_TOOLS
Cohere: Mode.COHERE_TOOLS
Perplexity: Mode.JSON
Groq: Mode.GROQ_TOOLS
Writer: Mode.WRITER_JSON
Bedrock: Mode.ANTHROPIC_TOOLS (for Claude on Bedrock)
Vertex AI: Mode.VERTEXAI_TOOLS

You can always customize this based on your specific needs and model capabilities.

Error Handling

The from_provider function includes robust error handling to help you quickly identify and fix issues:

# Missing dependency
try:
    client = instructor.from_provider("anthropic/claude-3-sonnet")
except ImportError as e:
    print("Error: Install the anthropic package first")
    # pip install anthropic

# Invalid provider format
try:
    client = instructor.from_provider("invalid-format")
except ValueError as e:
    print(e)  # Model string must be in format "provider/model-name"

# Unsupported provider
try:
    client = instructor.from_provider("unknown/model")
except ValueError as e:
    print(e)  # Unsupported provider: unknown. Supported providers are: ...

The function validates the provider string format, checks if the provider is supported, and ensures the necessary packages are installed.

Environment Variables

Like the native client libraries, from_provider respects environment variables set for each provider:

# Set environment variables
import os
os.environ["OPENAI_API_KEY"] = "your-openai-key"
os.environ["ANTHROPIC_API_KEY"] = "your-anthropic-key"
os.environ["MISTRAL_API_KEY"] = "your-mistral-key"

# No need to pass API keys directly
client = instructor.from_provider("openai/gpt-4")

Troubleshooting

Here are some common issues and solutions when using the unified provider interface:

Model Not Found Errors

If you receive a 404 error, check that you're using the correct model name format:

Error code: 404 - {'type': 'error', 'error': {'type': 'not_found_error', 'message': 'model: claude-3-haiku'}}

For Anthropic models, always include the version date: - ✅ Correct: anthropic/claude-3-haiku-20240307 - ❌ Incorrect: anthropic/claude-3-haiku

Provider-Specific Parameters

Some providers require specific parameters for API calls:

# Anthropic requires max_tokens
anthropic_client = instructor.from_provider(
    "anthropic/claude-3-haiku-20240307",
    max_tokens=400  # Required for Anthropic
)

# Use models with vision capabilities for multimodal content
gemini_client = instructor.from_provider(
    "google/gemini-pro-vision"  # Required for image processing
)

Working Example

Here's a complete example that demonstrates the automodel functionality with multiple providers:

import os
import asyncio
import instructor
from pydantic import BaseModel, Field

class UserInfo(BaseModel):
    """User information extraction model."""
    name: str = Field(description="The user's full name")
    age: int = Field(description="The user's age in years")
    occupation: str = Field(description="The user's job or profession")

async def main():
    # Test OpenAI
    openai_client = instructor.from_provider("openai/gpt-3.5-turbo")
    openai_result = openai_client.chat.completions.create(
        response_model=UserInfo,
        messages=[{"role": "user", "content": "Jane Doe is a 28-year-old data scientist."}]
    )
    print(f"OpenAI result: {openai_result.model_dump()}")

    # Test Anthropic with async client
    if os.environ.get("ANTHROPIC_API_KEY"):
        anthropic_client = instructor.from_provider(
            model="anthropic/claude-3-haiku-20240307",
            async_client=True,
            max_tokens=400  # Required for Anthropic
        )
        anthropic_result = await anthropic_client.chat.completions.create(
            response_model=UserInfo,
            messages=[{"role": "user", "content": "John Smith is a 35-year-old software engineer."}]
        )
        print(f"Anthropic result: {anthropic_result.model_dump()}")

if __name__ == "__main__":
    asyncio.run(main())

Conclusion

String-based initialization is a significant step toward making Instructor even more user-friendly and flexible. It reduces the learning curve for working with multiple providers and makes it easier than ever to experiment with different models.

Benefits include: - Simplified initialization with a consistent interface - Automatic selection of appropriate default modes - Support for both synchronous and asynchronous clients - Clear error messages to quickly identify issues - Respect for provider-specific environment variables - Comprehensive model selection across the entire LLM ecosystem

Whether you're building a new application or migrating an existing one, the unified provider interface offers a cleaner, more maintainable way to work with structured outputs across the LLM ecosystem.

Try it today with instructor.from_provider() and check out the complete example code in our repository!