
Do I Still Need Instructor with Google's New OpenAI Integration?

Google recently launched OpenAI client compatibility for Gemini.

While this is a significant step forward that simplifies working with Gemini models, you absolutely still need instructor.

If you're unfamiliar with instructor, we provide a simple interface to get structured outputs from LLMs across different providers.

This makes it easy to switch between providers, get reliable outputs from language models, and ultimately build production-grade LLM applications.

The current state

The new integration lets you call Gemini models through the OpenAI client, which makes function calling with Gemini much easier. We no longer need a Gemini-specific library like vertexai or google.generativeai to define response models.

Here's what that looks like:

from openai import OpenAI
client = OpenAI(
    base_url="https://generativelanguage.googleapis.com/v1beta/",
    api_key="YOUR_API_KEY"
)

response = client.chat.completions.create(
    model="gemini-1.5-flash",
    messages=[{"role": "user", "content": "Extract name and age from: John is 30"}]
)

While this seems convenient, there are three major limitations that make instructor still essential:

1. Limited Schema Support

The current implementation only supports simple, single-level schemas. This means you can't use complex nested schemas that are common in real-world applications. For example, this won't work:

from pydantic import BaseModel

class User(BaseModel):
    name: str
    age: int

class Users(BaseModel):
    users: list[User]  # Nested schema - will throw an error

2. No Streaming Support for Function Calling

The integration doesn't support streaming for function calling. This is a significant limitation if your application relies on streaming responses, which is increasingly common for:

  • Real-time user interfaces
  • Progressive rendering
  • Long-running extractions

3. No Multimodal Support

Perhaps the biggest limitation is the lack of multimodal support. Gemini's strength lies in its ability to process multiple types of inputs (images, video, audio), but the OpenAI compatibility layer doesn't support this. This means you can't:

  • Perform visual question answering
  • Extract structured data from images
  • Analyze video content
  • Process audio inputs

Why Instructor Remains Essential

Let's see how instructor solves these issues.

1. Easy Schema Management

It's easy to define and experiment with different response models as you build out your application. In our own experiments, we found that changing a single field name from final_choice to answer improved model accuracy from 4.5% to 95%.

The way we structure and name the fields in our response models can fundamentally alter how the model interprets and responds to queries. Manually editing schemas constrains your ability to iterate on your response models, leaves room for catastrophic errors, and limits what you can squeeze out of your models.

With instructor, you get the full power of Pydantic with Gemini through our from_gemini and from_vertexai integrations, rather than the limited schema support in the OpenAI compatibility layer.
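For instance, here's a minimal sketch of extracting the nested Users schema from above with from_gemini. It assumes the google-generativeai package is installed and a GOOGLE_API_KEY environment variable is set; the printed output is only illustrative.

import os
import instructor
import google.generativeai as genai
from pydantic import BaseModel

genai.configure(api_key=os.getenv("GOOGLE_API_KEY"))

client = instructor.from_gemini(
    client=genai.GenerativeModel(model_name="models/gemini-1.5-flash-latest"),
    mode=instructor.Mode.GEMINI_JSON,
)


class User(BaseModel):
    name: str
    age: int


class Users(BaseModel):
    users: list[User]  # Nested schemas work fine here


resp = client.messages.create(
    messages=[
        {"role": "user", "content": "Extract: Jason is 25 and Sarah is 30"}
    ],
    response_model=Users,
)
print(resp)
# Expected: users=[User(name='Jason', age=25), User(name='Sarah', age=30)]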

2. Streaming Support

instructor provides built-in support for streaming, allowing you to stream partial results as they're generated.

A common use case for streaming is extracting multiple items that share the same structure - e.g. multiple users, multiple products, or multiple events.

This is relatively easy to do with instructor:

from instructor import from_openai
from openai import OpenAI
from instructor import Mode
from pydantic import BaseModel
import os

client = from_openai(
    OpenAI(
        api_key=os.getenv("GOOGLE_API_KEY"),
        base_url="https://generativelanguage.googleapis.com/v1beta/",
    ),
    mode=Mode.MD_JSON,
)


class User(BaseModel):
    name: str
    age: int


resp = client.chat.completions.create_iterable(
    model="gemini-1.5-flash",
    messages=[
        {
            "role": "user",
            "content": "Generate 10 random users",
        }
    ],
    response_model=User,
)

for r in resp:
    print(r)
# name='Alice' age=25
# name='Bob' age=32
# name='Charlie' age=19
# name='David' age=48
# name='Emily' age=28
# name='Frank' age=36
# name='Grace' age=22
# name='Henry' age=41
# name='Isabella' age=30
# name='Jack' age=27

If you instead want to stream out a single item as it's being generated, you can use the create_partial method:

from instructor import from_openai
from openai import OpenAI
from instructor import Mode
from pydantic import BaseModel
import os

client = from_openai(
    OpenAI(
        api_key=os.getenv("GOOGLE_API_KEY"),
        base_url="https://generativelanguage.googleapis.com/v1beta/",
    ),
    mode=Mode.MD_JSON,
)


class Story(BaseModel):
    title: str
    summary: str


resp = client.chat.completions.create_partial(
    model="gemini-1.5-flash",
    messages=[
        {
            "role": "user",
            "content": "Generate a random bedtime story + 1 sentence summary",
        }
    ],
    response_model=Story,
)

for r in resp:
    print(r)



# title=None summary=None
# title='The Little Firefly Who Lost His Light' summary=None
# title='The Little Firefly Who Lost His Light' summary='A tiny firefly learns the true meaning of friendship when he loses his glow and a wise old owl helps him find it again.'

3. Multimodal Support

instructor supports multimodal inputs for Gemini models, allowing you to perform tasks like visual question answering, image analysis, and more.

You can see an example of how to use instructor with Gemini to extract travel recommendations from videos in our post on the topic.
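As a rough sketch of what this can look like (assuming a local clip.mp4, the google-generativeai SDK, and an instructor version that accepts uploaded Gemini files in message content), the pattern is roughly:

import instructor
import google.generativeai as genai
from pydantic import BaseModel


class Recommendation(BaseModel):
    location: str
    description: str


class Recommendations(BaseModel):
    recommendations: list[Recommendation]


client = instructor.from_gemini(
    client=genai.GenerativeModel(model_name="models/gemini-1.5-flash-latest"),
)

# Upload the video to the Gemini Files API; large files may need a short
# wait until processing finishes before they can be used in a request
video_file = genai.upload_file("./clip.mp4")

resp = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": [
                "Extract travel recommendations from this video",
                video_file,
            ],
        }
    ],
    response_model=Recommendations,
)
print(resp)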

What else does Instructor offer?

Beyond solving the core limitations of Gemini's new OpenAI integration, instructor provides a number of features that make it indispensable for production-grade applications.

1. Provider Agnostic API

Switching between providers shouldn't require rewriting your entire codebase. With instructor, it's as simple as changing just a few lines of code.

from openai import OpenAI
from instructor import from_openai

client = from_openai(
    OpenAI()
)

# rest of code

If we wanted to switch to Anthropic, all it takes is changing the following lines of code:

from anthropic import Anthropic
from instructor import from_anthropic

client = from_anthropic(
    Anthropic()
)

# rest of code

2. Automatic Validation and Retries

Production applications need reliable outputs. Instructor handles this by validating all outputs against your desired response model and automatically retrying outputs that fail validation.

With our tenacity integration, you get full control over retries when needed, letting you use exponential backoff and other retry strategies with ease.

import openai
import instructor
from pydantic import BaseModel
from tenacity import Retrying, stop_after_attempt, wait_fixed

client = instructor.from_openai(openai.OpenAI(), mode=instructor.Mode.TOOLS)


class UserDetail(BaseModel):
    name: str
    age: int


response = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=UserDetail,
    messages=[
        {"role": "user", "content": "Extract `jason is 12`"},
    ],
    # Stop after the second attempt and wait a fixed 1 second between attempts
    max_retries=Retrying(
        stop=stop_after_attempt(2),
        wait=wait_fixed(1),
    ),
)
print(response.model_dump_json(indent=2))
"""
{
  "name": "jason",
  "age": 12
}
"""

Conclusion

While Google's OpenAI compatibility layer is a welcome addition, there are still a few reasons why you might want to stick with instructor for now.

Within a single package, you get features such as a provider agnostic API, streaming capabilities, multimodal support, automatic re-asking and more.

Give us a try today by installing instructor with pip install instructor and see why Pydantic is all you need for a production-grade LLM application.

Audio Support in OpenAI's Chat Completions API

OpenAI has recently introduced audio support in their Chat Completions API, opening up exciting new possibilities for developers working with audio and text interactions. This feature is powered by the new gpt-4o-audio-preview model, which brings advanced voice capabilities to the familiar Chat Completions API interface.

Key Features

The new audio support in the Chat Completions API offers several compelling features:

  1. Flexible Input Handling: The API can now process any combination of text and audio inputs, allowing for more versatile applications.

  2. Natural, Steerable Voices: Similar to the Realtime API, developers can use prompting to shape various aspects of the generated audio, including language, pronunciation, and emotional range.

  3. Tool Calling Integration: The audio support seamlessly integrates with existing tool calling functionality, enabling complex workflows that combine audio, text, and external tools.

Practical Example

To demonstrate how to use this new functionality, let's look at a simple example using the instructor library:

from openai import OpenAI
from pydantic import BaseModel
import instructor
from instructor.multimodal import Audio
import base64

client = instructor.from_openai(OpenAI())

class Person(BaseModel):
    name: str
    age: int

resp = client.chat.completions.create(
    model="gpt-4o-audio-preview",
    response_model=Person,
    modalities=["text"],
    audio={"voice": "alloy", "format": "wav"},
    messages=[
        {
            "role": "user",
            "content": [
                "Extract the following information from the audio",
                Audio.from_path("./output.wav"),
            ],
        },
    ],
)

print(resp)
# Expected output: Person(name='Jason', age=20)

In this example, we're using the gpt-4o-audio-preview model to extract information from an audio file. The API processes the audio input and returns structured data (a Person object with name and age) based on the content of the audio.

Use Cases

The addition of audio support to the Chat Completions API enables a wide range of applications:

  1. Voice-based Personal Assistants: Create more natural and context-aware voice interfaces for various applications.

  2. Audio Content Analysis: Automatically extract information, sentiments, or key points from audio recordings or podcasts.

  3. Language Learning Tools: Develop interactive language learning applications that can process and respond to spoken language.

  4. Accessibility Features: Improve accessibility in applications by providing audio-based interactions and text-to-speech capabilities.

Considerations

While this new feature is exciting, it's important to note that it's best suited for asynchronous use cases that don't require extremely low latencies. For more dynamic and real-time interactions, OpenAI recommends using their Realtime API.

As with any AI-powered feature, it's crucial to consider ethical implications and potential biases in audio processing and generation. Always test thoroughly and consider the diversity of your user base when implementing these features.

OpenAI API Model Distillation with Instructor

OpenAI has recently introduced a new feature called API Model Distillation, which allows developers to create custom models tailored to their specific use cases. This feature is particularly powerful when combined with Instructor's structured output capabilities. In this post, we'll explore how to leverage API Model Distillation with Instructor to create more efficient and specialized models.
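As a rough sketch of the idea (store and metadata are standard Chat Completions parameters that instructor passes through to OpenAI; the metadata tag used here is just an example), you can flag your structured extraction calls so the completions are stored and can later serve as a distillation dataset:

import instructor
from openai import OpenAI
from pydantic import BaseModel

client = instructor.from_openai(OpenAI())


class UserDetail(BaseModel):
    name: str
    age: int


resp = client.chat.completions.create(
    model="gpt-4o",
    response_model=UserDetail,
    # store=True persists the request/response pair so it can be used for
    # distillation; metadata makes those stored completions easy to filter
    store=True,
    metadata={"use-case": "user-extraction"},
    messages=[{"role": "user", "content": "Extract `jason is 12`"}],
)
print(resp)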

Should I Be Using Structured Outputs?

OpenAI recently announced Structured Outputs, which ensures that generated responses match any arbitrary JSON Schema you provide. In their announcement article, they acknowledged that it had been inspired by libraries such as instructor.

Main Challenges

If you're building complex LLM workflows, you've likely considered OpenAI's Structured Outputs as a potential replacement for instructor.

But before you do so, three key challenges remain:

  1. Limited Validation And Retry Logic: Structured Outputs ensures adherence to the schema but not to useful content. You might get perfectly formatted yet unhelpful responses.
  2. Streaming Challenges: Parsing raw JSON objects from streamed responses with the SDK is error-prone and inefficient.
  3. Unpredictable Latency Issues: Structured Outputs suffers from random latency spikes that can result in an almost 20x increase in response time.

Additionally, adopting Structured Outputs locks you into OpenAI's ecosystem, limiting your ability to experiment with diverse models or providers that might better suit specific use-cases.

This vendor lock-in increases vulnerability to provider outages, potentially causing application downtime and SLA violations, which can damage user trust and impact your business reputation.

In this article, we'll show how instructor addresses many of these challenges with features such as automatic reasking when validation fails, automatic support for validated streaming data and more.
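For instance, here's a minimal sketch of a semantic validator: the schema alone can't reject a useless answer, but a Pydantic validator can, and instructor feeds the validation error back to the model and retries. The length threshold is an arbitrary example.

import instructor
from openai import OpenAI
from pydantic import BaseModel, field_validator

client = instructor.from_openai(OpenAI())


class Answer(BaseModel):
    answer: str

    @field_validator("answer")
    @classmethod
    def must_be_substantive(cls, v: str) -> str:
        # Schema-valid but unhelpful answers raise an error, triggering a reask
        if len(v.split()) < 5:
            raise ValueError("Answer is too short to be useful, please elaborate")
        return v


resp = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=Answer,
    max_retries=3,
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
)
print(resp.answer)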

Announcing instructor=1.0.0

Over the past 10 months, we've built up instructor around the principle of 'easy to try, and easy to delete'. We accomplished this by patching the openai client with the instructor package and adding new arguments like response_model, max_retries, and validation_context. As a result, I truly believe instructor is the best way to get structured data out of LLM APIs.

But as a result, we've been a bit stuck on getting typing to work well while giving you more control at development time. I'm excited to launch version 1.0.0, which cleans up the API with respect to typing without compromising ease of use.
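As a small sketch of what this looks like in practice, the from_openai factory returns a client whose create call is typed against your response_model, so your editor knows the result is a UserDetail; the model name and output shown are only illustrative.

import instructor
from openai import OpenAI
from pydantic import BaseModel


class UserDetail(BaseModel):
    name: str
    age: int


client = instructor.from_openai(OpenAI())

# The return type is inferred as UserDetail, so attribute access is type-checked
user = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=UserDetail,
    messages=[{"role": "user", "content": "Extract `jason is 12`"}],
)
print(user.name, user.age)
# > jason 12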