Using Structured Outputs to convert messy tables into tidy data
Why is this a problem?
Messy data exports are a common problem. Tables with multiple header rows, merged cells, or implicit relationships between rows make analysis a pain. Using instructor with structured outputs makes it easy to convert such tables into tidy data, even if all you have is an image of the table, as we'll see below.
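To make "tidy" concrete: in a tidy table, every row is a complete record on its own. The sketch below is only an illustration of that target shape. It uses plain Python rather than instructor, and the rows, column names, and the `forward_fill` helper are all made up for this example; none of them come from the table in this post.

```python
# Hypothetical messy export: a blank (None) category means
# "same category as the row above", as with merged or empty cells.
messy_rows = [
    ("Fruit", "Apple", 3),
    (None, "Banana", 5),
    ("Vegetable", "Carrot", 2),
    (None, "Potato", 7),
]


def forward_fill(rows):
    """Return tidy rows where every record carries its own category."""
    tidy = []
    current_category = None
    for category, item, quantity in rows:
        # Remember the last category we saw and reuse it for blank cells.
        if category is not None:
            current_category = category
        tidy.append(
            {"category": current_category, "item": item, "quantity": quantity}
        )
    return tidy


tidy_rows = forward_fill(messy_rows)
for row in tidy_rows:
    print(row)
```

A hand-written rule like this only works when the layout is known in advance and the data is already machine-readable, which is why delegating the conversion to a model with a structured output schema is attractive for arbitrary tables or images.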
Let's look at the following table as an example. It makes analysis unnecessarily difficult because it hides data relationships through empty cells and implicit repetition. If we were using it for data analysis, cleaning it manually would be a huge nightmare.