Skip to content

Structured outputs with LiteLLM, a complete guide w/ instructor

LiteLLM provides a unified interface for multiple LLM providers, making it easy to switch between different models and providers. This guide shows you how to use Instructor with LiteLLM for type-safe, validated responses across various LLM providers.

Quick Start

Install Instructor with LiteLLM support:

pip install "instructor[litellm]"

Simple User Example (Sync)

from litellm import completion
import instructor
from pydantic import BaseModel

# Enable instructor patches
client = instructor.from_litellm(completion)

class User(BaseModel):
    name: str
    age: int

# Create structured output
user = client.chat.completions.create(
    model="gpt-3.5-turbo",  # Can use any supported model
    messages=[
        {"role": "user", "content": "Extract: Jason is 25 years old"},
    ],
    response_model=User,
)

print(user)  # User(name='Jason', age=25)

Simple User Example (Async)

from litellm import acompletion
import instructor
from pydantic import BaseModel
import asyncio

# Enable instructor patches for async
client = instructor.from_litellm(acompletion)


class User(BaseModel):
    name: str
    age: int


async def extract_user():
    user = await client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "user", "content": "Extract: Jason is 25 years old"},
        ],
        response_model=User,
    )
    return user


# Run async function
user = asyncio.run(extract_user())
print(user)  # User(name='Jason', age=25)

Cost Calculation

In order to calculate the cost of the response, LiteLLM provides a simple response_cost attribute on the response object's _hidden_params attribute. This is recorded in their documentation here.

Here is a code snippet using instructor to calculate the cost of the response:

import instructor
from litellm import completion
from pydantic import BaseModel


class User(BaseModel):
    name: str
    age: int


client = instructor.from_litellm(completion)
instructor_resp, raw_completion = client.chat.completions.create_with_completion(
    model="claude-3-5-sonnet-20240620",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": "Extract Jason is 25 years old.",
        }
    ],
    response_model=User,
)

print(raw_completion._hidden_params["response_cost"])
#> 0.00189

Updates and Compatibility

Instructor maintains compatibility with LiteLLM's latest releases. Check the changelog for updates.

Note: Always verify provider-specific features and limitations in their respective documentation before implementation.