Test case collections
It is important to make sure that your LLM-powered applications work well in both expected and unexpected scenarios. For that, we have the concept of a test case collection, which is a dataset of test cases. A test case consists of a set of inputs, an optional target (gold standard answer), and optional tags. The inputs are key-value pairs that represent the input to your prompt, chain, or agent. The target is the expected answer for those inputs. The tags let you filter test cases in different places on the platform. Once a test case has been created, everything except its tags is immutable.
You can use test case collections in the Lab, in the Prompt IDE, or in a benchmarking job to test your prompts, chains, and agents.
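To make the shape of a test case concrete, here is a minimal Python sketch of the structure described above. It is only an illustration: the class name `CaseRecord` and its field names are hypothetical and are not part of the platform's API. It models the inputs and target as read-only while leaving the tags editable.

```python
from dataclasses import dataclass, field
from types import MappingProxyType
from typing import Mapping, Optional, Set


@dataclass(frozen=True)
class CaseRecord:
    """Hypothetical model of a test case (not the platform's own type)."""

    inputs: Mapping[str, str]                      # key-value inputs for the prompt, chain, or agent
    target: Optional[str] = None                   # optional gold standard answer
    tags: Set[str] = field(default_factory=set)    # optional tags, editable after creation

    def __post_init__(self) -> None:
        # Keep a read-only copy of the inputs so they cannot be changed later.
        object.__setattr__(self, "inputs", MappingProxyType(dict(self.inputs)))


case = CaseRecord(
    inputs={"question": "What is the capital of France?"},
    target="Paris",
    tags={"geo"},
)
case.tags.add("easy")  # tags can still be changed
```

Reassigning `case.target` or writing to `case.inputs` raises an error in this sketch, which mirrors the rule that only the tags can change after creation.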
Defining test case collections
Upload a CSV file
You can create a test case collection by uploading a CSV file in the test hub. Each row of the CSV file is imported as one test case. If a target column is present, it is used as the gold standard answer. Similarly, if a tags column is present, it is used as the tags and is expected to contain a comma-separated list (e.g. tag1, tag2). The remaining columns define the inputs of your test cases.
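As an illustration of the column mapping described above, the following Python sketch parses a small CSV file into test cases using only the standard library. It is not the platform's importer: the function name `parse_test_cases`, the example column names `question` and `context`, and the choice to treat an empty target cell as "no target" are all assumptions made for this example.

```python
import csv
import io

# Example file contents. The tags cell is quoted because it contains a comma.
CSV_TEXT = '''question,context,target,tags
What is the capital of France?,Geography quiz,Paris,"geo, easy"
Summarize the text.,Long article body,,summarization
'''


def parse_test_cases(csv_text):
    """Map each CSV row to inputs, an optional target, and a list of tags."""
    cases = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Assumption: an empty target cell means the test case has no target.
        target = row.pop("target", None) or None
        raw_tags = row.pop("tags", None) or ""
        tags = [tag.strip() for tag in raw_tags.split(",") if tag.strip()]
        # Every remaining column becomes an input.
        cases.append({"inputs": dict(row), "target": target, "tags": tags})
    return cases


cases = parse_test_cases(CSV_TEXT)
```

Because the tags are themselves comma-separated, a tags cell with more than one tag has to be wrapped in double quotes so that the CSV parser does not split it into separate columns.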
From experiments & production traffic logs
As you experiment in the Lab or in the Prompt IDE, you can use the "Add to test case collection" button to add the current inputs to a test case collection. You can also define test cases from your production traffic logs.