Instructor
Instrument & test instructor
code with Parea AI
Instructor makes it easy to reliably get structured data like JSON from LLMs.
Parea’s instructor
integration provides these features:
- groups any LLM calls due to retries together under a single trace
- tracks any field which failed validation with the respective error message
- visualizes validation error count over time
- the annotation queue provides a UI to label JSON responses by filling out a form instead of editing JSON objects
Quickstart
First, create a Parea API key as shown here.
Then, you will need to wrap the OpenAI client with Parea using p.wrap_openai_client(client, "instructor")
.
Finally, you can use instructor.patch
/ instructor.from_openai
to patch the OpenAI client with Instructor.
In a single code snippet:
Visualizing traces & validation errors
In your Parea logs dashboard, you can visualize your traces and see the detailed steps the LLM took including examining the structured output and the “functions/tools” instructor attached to the LLM call.
To take a look at trace of this execution checkout the screenshot below. Noticeable:
- left sidebar: all related LLM calls are grouped under a trace called
instructor
- middle section: the root trace visualizes the
templated_inputs
as inputs and the createdEmail
object as output - bottom of right sidebar: any validation errors are captured and tracked as score for the trace which enables visualizing them in dashboards and filtering by them on tables
Tracking & visualizing the validation error count over time.
Here is the Email
function schema we passed to OpenAI.
Improving LLMs for Structured Output Generation
In order to improve the performance of your function call responses, you can send the requests to an annotation queue. In that annotation queue, non-engineers can easily label the function call responses by filling out a form, and add the corrected responses to a dataset which you can use for fine-tuning.
Fully Working Example
Below you can see a fully-working example code which uses Instructor to classify questions into different types.