Structured Outputs with LiteLLM: A Complete Guide with Instructor

LiteLLM provides a unified interface for multiple LLM providers, making it easy to switch between different models and providers. This guide shows you how to use Instructor with LiteLLM for type-safe, validated responses across various LLM providers.

Quick Start

Install Instructor with LiteLLM support:

pip install "instructor[litellm]"

Simple User Example (Sync)

import instructor
from pydantic import BaseModel

# Create an Instructor client that routes requests through LiteLLM
client = instructor.from_provider("litellm/gpt-3.5-turbo")

class User(BaseModel):
    name: str
    age: int

# Create structured output
user = client.chat.completions.create(
    messages=[
        {"role": "user", "content": "Extract: Jason is 25 years old"},
    ],
    response_model=User,
)

print(user)  # User(name='Jason', age=25)

Simple User Example (Async)

import instructor
from pydantic import BaseModel
import asyncio

client = instructor.from_provider(
    "litellm/gpt-3.5-turbo",
    async_client=True,
)


class User(BaseModel):
    name: str
    age: int


async def extract_user():
    user = await client.chat.completions.create(
        messages=[
            {"role": "user", "content": "Extract: Jason is 25 years old"},
        ],
        response_model=User,
    )
    return user


# Run async function
user = asyncio.run(extract_user())
print(user)  # User(name='Jason', age=25)
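
One reason to use the async client is to run several extractions concurrently. The following is a minimal sketch of that pattern with asyncio.gather. It makes no network calls: fake_extract is a hypothetical stand-in for the awaited client.chat.completions.create(...) call above, and User is a plain dataclass here rather than the Pydantic model, so the example runs without an API key.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class User:
    # Stand-in for the Pydantic User model used in this guide.
    name: str
    age: int


async def fake_extract(text: str) -> User:
    # Hypothetical replacement for:
    #   await client.chat.completions.create(messages=[...], response_model=User)
    # It parses sentences of the form "<name> is <age> years old".
    await asyncio.sleep(0)  # yield control, as a real network call would
    name, _, rest = text.partition(" is ")
    return User(name=name, age=int(rest.split()[0]))


async def extract_many(texts: list[str]) -> list[User]:
    # Schedule all extractions at once; results come back in input order.
    return await asyncio.gather(*(fake_extract(t) for t in texts))


users = asyncio.run(
    extract_many(["Jason is 25 years old", "Ana is 31 years old"])
)
print(users)
```

In real use, the body of fake_extract would be replaced by the awaited Instructor call, and provider rate limits may make it necessary to cap concurrency (for example with asyncio.Semaphore).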

Cost Calculation

LiteLLM records the cost of each call as a response_cost entry in the _hidden_params attribute of the raw response object. LiteLLM's own documentation describes this attribute.

The following snippet uses Instructor's create_with_completion, which returns both the parsed model and the raw LiteLLM completion, and then reads the cost from the raw completion:

import instructor
from pydantic import BaseModel


class User(BaseModel):
    name: str
    age: int


client = instructor.from_provider("litellm/gpt-3.5-turbo")
instructor_resp, raw_completion = client.chat.completions.create_with_completion(
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": "Extract Jason is 25 years old.",
        }
    ],
    response_model=User,
)

print(raw_completion._hidden_params["response_cost"])
#> 0.00189
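
When several calls are made, the per-response costs can be summed. The sketch below is one way to do that defensively. The helper names response_cost and total_cost are hypothetical, and the code assumes _hidden_params is dict-like and that response_cost may be missing or None (for example, when LiteLLM has no pricing data for a model). SimpleNamespace objects stand in for real raw completions so the example runs offline.

```python
from types import SimpleNamespace
from typing import Any, Iterable


def response_cost(raw_completion: Any) -> float:
    """Return the recorded cost of one completion, or 0.0 if none is recorded."""
    # Assumption: _hidden_params is a dict-like attribute on the raw response.
    hidden = getattr(raw_completion, "_hidden_params", None) or {}
    cost = hidden.get("response_cost")
    return float(cost) if cost is not None else 0.0


def total_cost(completions: Iterable[Any]) -> float:
    """Sum the recorded cost over several raw completions."""
    return sum(response_cost(c) for c in completions)


# Stand-ins for the raw_completion objects returned by create_with_completion.
fake_completions = [
    SimpleNamespace(_hidden_params={"response_cost": 0.0010}),
    SimpleNamespace(_hidden_params={"response_cost": None}),  # cost unknown
    SimpleNamespace(),  # no _hidden_params at all
]

total = total_cost(fake_completions)
print(f"{total:.4f}")
```

Treating a missing cost as 0.0 keeps the total easy to compute but can understate spend; raising an error or logging a warning instead may be preferable where accurate accounting matters.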

Updates and Compatibility

Instructor maintains compatibility with LiteLLM's latest releases. Check the changelog for updates.

Note: Always verify provider-specific features and limitations in their respective documentation before implementation.