# Should I Be Using Structured Outputs?
OpenAI recently announced Structured Outputs, which ensures that generated responses match an arbitrary developer-provided JSON Schema. In their announcement article, they acknowledged that it was inspired by libraries such as instructor.
## Main Challenges
If you're building complex LLM workflows, you've likely considered OpenAI's Structured Outputs as a potential replacement for instructor.

But before you make the switch, three key challenges remain:
- Limited Validation and Retry Logic: Structured Outputs ensures adherence to the schema, but says nothing about the usefulness of the content. You can get perfectly formatted yet unhelpful responses (see the sketch after this list)
- Streaming Challenges: Parsing raw JSON objects out of streamed responses with the SDK is error-prone and inefficient
- Unpredictable Latency Issues: Structured Outputs suffers from random latency spikes that can result in an almost 20x increase in response time
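To make the first point concrete, here is a minimal sketch of the kind of content-level check that schema enforcement alone cannot express. The `Answer` model, the placeholder values it rejects, and the validator name are illustrative, not taken from any particular codebase.

```python
from typing import Annotated

from pydantic import AfterValidator, BaseModel


def reject_placeholder(value: str) -> str:
    # Schema enforcement guarantees you get a string back; it does not
    # guarantee the string is useful. Reject obvious filler content.
    if value.strip().lower() in {"", "n/a", "unknown", "todo"}:
        raise ValueError("answer is empty or placeholder text, provide a substantive answer")
    return value


class Answer(BaseModel):
    # Structured Outputs can force this field to exist and be a string,
    # but only a validator like this can insist on real content.
    answer: Annotated[str, AfterValidator(reject_placeholder)]
```

A plain Structured Outputs call would happily accept `{"answer": "N/A"}`; with a validator like this wired into a retry loop, that failure becomes a signal to ask the model again.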
Additionally, adopting Structured Outputs locks you into OpenAI's ecosystem, limiting your ability to experiment with models or providers that might better suit specific use cases.
This vendor lock-in also increases your exposure to provider outages, which can cause application downtime and SLA violations, damaging user trust and your business's reputation.
In this article, we'll show how instructor addresses many of these challenges with features such as automatic reasking when validation fails, built-in support for validated streaming of partial data, and more.
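As a taste of what that looks like, here is a minimal sketch using instructor's patched OpenAI client, assuming a recent instructor release; the `UserProfile` model, the prompt, and the model name are illustrative.

```python
import instructor
from openai import OpenAI
from pydantic import BaseModel, field_validator


class UserProfile(BaseModel):
    name: str
    age: int

    @field_validator("age")
    @classmethod
    def age_must_be_plausible(cls, v: int) -> int:
        # A content-level check that a schema alone cannot express.
        if not 0 < v < 130:
            raise ValueError("age must be between 1 and 129")
        return v


# Patch the OpenAI client so response_model and max_retries are available.
client = instructor.from_openai(OpenAI())

# If validation fails, instructor re-asks the model with the validation
# error included, up to max_retries times.
profile = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=UserProfile,
    max_retries=3,
    messages=[{"role": "user", "content": "Extract: Jason is 25 years old."}],
)

# Streaming: create_partial yields progressively populated UserProfile
# objects, so you never hand-parse half-finished JSON from the raw stream.
for partial in client.chat.completions.create_partial(
    model="gpt-4o-mini",
    response_model=UserProfile,
    messages=[{"role": "user", "content": "Extract: Jason is 25 years old."}],
):
    print(partial)
```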