The Datasets tab is where you can create and manage reusable data. A dataset is a collection of test cases which consist of three things:

  • Inputs: prompt variable values that are interpolated into the prompt template of your model config at generation time (i.e., they replace the {{ variables }} you define in the prompt template.
  • Target: Represents the expected or intended output of the model.
  • Tags: Any string metadata tags you want to attach to a test case.

Creating Datasets

From uploaded CSV

You can create a dataset by uploading a CSV file to the Datasets tab. Go to Datasets tab and click Upload file to create dataset. In the following modal, provide a name for your dataset and upload your file.

Upload a dataset

Each row from the CSV file represents a test case. The column names represent the prompt template input variable names.

For example, if the prompt template is:

prompt template
Please solve the following math question: {{ question }}

Then the CSV file would need a column named question.

Different delimiter types are supported, including comma, tab, pipe, and semicolon.

The names target and tags are reserved.

If a target column is present, it will be used as the gold standard answer for that row’s output. For example, in the CSV below, the target 4 is the expected answer to the question What is 2 + 2.

CSV
question, target
What is 2 + 2, 4

If a tags column is present, it will be used as metadata tags for a specific row. Tags should be comma-separated with no spaces.

For example, in the CSV below, row one has been tagged as easy and arithmetic, and the second row hard and calculus.

CSV
question, target, tags
What is 2 + 2, 4, easy, arithmetic
Evaluate ∫(1 / x^4 + 1)dx, x - 1/3(x^-3) + C, hard,calculus

Tags are helpful for filtering test cases in the playground. If you have imported 10 cases in the playground and run an evaluation on all cases, then you can filter the cases by tag and see the average score for only the selected cases. This helps you understand whether your prompt performs well on specific test cases.

From the SDK

Using the Python or TypeScript SDK, you can create, read and update datasets programmatically.

See the api-reference for more details.

From the playground

In the playground, after you click Add test case, you can optionally select Upload new dataset to upload a CSV file.

From trace logs

See Observability - Datasets for more details.

From annotation queues

See labeling in annotation queues for more details.

Exporting Datasets

You can use the Python or TypeScript SDK to export datasets via the Get Dataset endpoint (API docs).

Converting to JSONL for fine-tuning

You can use helper functions in the SDK to download and then convert a dataset to JSONL format for fine-tuning.

Where can I use datasets?