Although Parea helps you automatically evaluate AI software, human review remains a critical part of the process. To support this, Parea lets you annotate and comment on trace logs, and label outputs to curate datasets. For example, you can annotate logs from experiments to collect feedback from subject-matter experts, comment on logs to discuss them with your team, or label outputs to build “golden” datasets.

To structure this work, Parea lets you define annotation criteria that outline how responses should be annotated. You can annotate, comment, and label either from the detailed log view or in an annotation queue. Finally, you can use the manual annotations to automate part of this process by creating LLM judges that are aligned with them.
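
While annotating, commenting, and labeling happen in the Parea UI, feedback can also be attached to a logged trace programmatically. Below is a minimal sketch assuming the Parea Python SDK exposes a `Parea` client with a `record_feedback` method and a `FeedbackRequest` schema; the exact import paths, names, and the placeholder trace ID are illustrative, so check the SDK reference for your version.

```python
import os

# Illustrative imports: module paths may differ between SDK versions.
from parea import Parea
from parea.schemas import FeedbackRequest

p = Parea(api_key=os.getenv("PAREA_API_KEY"))

# The trace ID would come from one of your instrumented LLM calls
# (placeholder value here).
trace_id = "<trace-id-of-a-logged-call>"

# Attach a numeric score to that trace, e.g. after a subject-matter
# expert has reviewed the output.
p.record_feedback(FeedbackRequest(trace_id=trace_id, score=1.0))
```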