- How it works: Runs multiple existing evaluations and combines their results using different mathematical approaches (average, weighted, or custom formula). Note that sub-evaluations do not create their own results!
- Best for: Holistic quality assessment, combining multiple evaluation criteria, creating overall performance metrics, balancing trade-offs between different aspects (e.g., accuracy vs. safety).
- Requires: At least two existing evaluations configured on the same prompt. These evaluations can be of any type, even other Composite Scores!
Currently, composite evaluations cannot run in live mode and only support
sub-evaluations that do not require an expected output. For more information,
see the Running Evaluations guide.
Setup
1. Go to the evaluations tab: open the evaluations tab on a prompt in one of your projects.
2. Combine evaluations: in the top right corner, click the “Combine evaluations” button.
3. Choose a metric: pick how the scores are combined (Average, Weighted, or Custom; see Metrics).
4. Select sub-evaluations: select the evaluations you want to combine. You need to select at least two.

Metrics
Average
Combines scores evenly. The resulting score is the average.
Weighted
Combines scores using custom weights. The resulting score is the weighted
blend. Weights are expressed as percentages and must add up to 100%.
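
As a rough illustration of the arithmetic only (not the platform's implementation), the weighted blend can be sketched in Python. The function name, the sub-evaluation names, and the 0 to 100 score scale are assumptions made for this example. With equal weights the result reduces to the plain average.

```python
def weighted_score(scores, weights):
    """Blend sub-evaluation scores using percentage weights that sum to 100."""
    # Hypothetical helper: every sub-evaluation needs exactly one weight.
    if scores.keys() != weights.keys():
        raise ValueError("scores and weights must cover the same evaluations")
    if abs(sum(weights.values()) - 100) > 1e-9:
        raise ValueError("weights must add up to 100%")
    # Each score contributes in proportion to its percentage weight.
    return sum(scores[name] * weights[name] / 100 for name in scores)


# Assumed example data: two sub-evaluations scored on a 0-100 scale.
scores = {"accuracy": 80, "safety": 100}
weights = {"accuracy": 75, "safety": 25}

weighted = weighted_score(scores, weights)                        # 80*0.75 + 100*0.25
equal = weighted_score(scores, {"accuracy": 50, "safety": 50})    # same as the average
```

Here `weighted` is 85.0 and `equal` is 90.0, which matches the simple average of the two scores.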
Custom
Combines scores using a custom formula. The resulting score is the result of
the expression. The expression can be a complex mathematical formula.
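
The expression syntax the product accepts is not shown here, so the following Python function is only a sketch of what a custom formula might compute. The formula itself, the names `accuracy` and `safety`, and the 0 to 100 scale are all hypothetical. It shows one way to balance a trade-off: the accuracy score is scaled down when the safety score is low.

```python
def custom_score(accuracy, safety):
    """Hypothetical custom formula: accuracy scaled by the safety score."""
    # Both inputs are assumed to be on a 0-100 scale, so the result is too.
    return accuracy * safety / 100


safe_response = custom_score(80, 100)   # safety is perfect, accuracy passes through
risky_response = custom_score(80, 50)   # low safety halves the combined score
```

Unlike an average, this kind of formula lets one weak sub-evaluation pull the overall score down sharply.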