Experiments give you a summary of the quality of your LLM application. On the Experiments tab you can see the aggregated stats of an experiment. At the bottom is a table with the experiment's individual traces; clicking on a row opens a detailed trace that you can step through. You can also pin additional statistics to the experiment (e.g., the median or Spearman correlation of measurements) by clicking the Pin Stat button.

Experiment

You can run an experiment for your application by calling the experiment method of a Parea client and passing it the data and the function you want to run. You must annotate the function with the trace decorator to trace its inputs, outputs, latency, etc., and to specify which evaluation functions should be executed.

You can optionally specify the name of the experiment; otherwise it will be generated automatically. Note that the experiment name must be unique within the project and can only contain alphanumeric characters, dashes, and underscores.

  • Python

import os

from dotenv import load_dotenv

from parea import Parea, trace
from parea.evals import call_openai
from parea.schemas import Log

load_dotenv()

p = Parea(api_key=os.getenv("PAREA_API_KEY"))


# Evaluation function(s)
def is_between_1_and_n(log: Log) -> float:
    """Evaluates if the number is between 1 and n"""
    n = log.inputs["n"]
    try:
        return 1.0 if 1.0 <= float(log.output) <= float(n) else 0.0
    except ValueError:
        return 0.0


# annotate the function with the trace decorator and pass the evaluation function(s)
@trace(eval_funcs=[is_between_1_and_n])
def generate_random_number(n: str) -> str:
    return call_openai(
        [
            {"role": "user", "content": f"Generate a number between 1 and {n}."},
        ],
        model="gpt-3.5-turbo",
    )


# Define the experiment
# You can use the CLI command to execute this experiment
p.experiment(
    data=[{"n": "10"}],           # Data to run the experiment on (list of k/v pairs)
    func=generate_random_number,  # Function to run (callable)
)

# You can optionally run the experiment manually by calling `.run()`
# p.experiment(
#     data=[{"n": "10"}],
#     func=generate_random_number,
# ).run(name="random-numbers")  # experiment name; must be unique within the project and may only contain alphanumeric characters, hyphens, and underscores. If no name is provided, a random name is generated.

Then, you can run the experiment using the experiment CLI command and give it the path to the Python file. This will run your experiment with the specified inputs and create a report with the results, which can be viewed under the Experiments tab.

parea experiment <path/to/experiment_file.py>

You can optionally specify the name of the experiment using the --name flag in the CLI or by passing it to the run function.
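
For example, assuming the flag is appended after the file path, naming a run from the CLI might look like this (random-numbers is just a placeholder name):

parea experiment <path/to/experiment_file.py> --name random-numbers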

Add Metadata to Experiments

When running an experiment, you can add metadata to it by passing a dictionary. This metadata is displayed on the experiment overview table and can be used to filter and search for experiments.

p.experiment(
    data=data,
    func=func,
    metadata={"param1": "value1"},
).run()

Experiment Metadata

Use Saved Datasets

When running an experiment, you can use your datasets saved on Parea. For the data field, simply provide the name of the dataset as defined on the Datasets tab. The dataset should have column names that match the input parameters of the function you are running the experiment on. Note that the dataset name is automatically stored under the "Dataset" key in the experiment metadata.

p.experiment(
    data="Dataset Name",
    func=func,
).run()
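
For example, reusing the generate_random_number function from above, the saved dataset (the name "Random Number Inputs" is just an illustrative placeholder) would need a column named n to match the function's parameter:

# Hypothetical dataset saved on Parea with a single column "n",
# matching the signature of generate_random_number(n: str)
p.experiment(
    data="Random Number Inputs",  # dataset name as shown on the Datasets tab
    func=generate_random_number,
).run()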

Sharing Experiments Publicly

By default, all your experiments are shared within your organization and are not publicly accessible. You can share an experiment publicly by clicking the Share button at the top right of the experiment page. This generates a link of the form https://app.parea.ai/public-experiments/<org_slug>/<project_name>/<experiment_uuid>, which anyone can access. You can compare all public experiments in a project under https://app.parea.ai/public-experiments/<org_slug>/<project_name>.

Visible Experiment

Projects

You can organize your experiments by project. By default, all experiments are created in the default project. You can learn more about projects here.
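
As a sketch, assuming the Parea client accepts a project_name argument (see the projects documentation for the exact parameter), you could scope experiments to a specific project when creating the client:

# Assumption: the Parea constructor accepts `project_name`; "my-project" is a placeholder.
# Experiments created through this client would then appear under that project
# instead of the default project.
p = Parea(api_key=os.getenv("PAREA_API_KEY"), project_name="my-project")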