
How to use the Calibration Performance report

Written by Alex Richards



Navigation prompt

Go to Reporting > Calibration Performance

The Calibration Performance report helps you analyse how consistently evaluators score during calibration sessions. It compares "blind" scores (made independently) against "calibrated" scores (agreed upon after discussion), helping you identify scoring variations and training opportunities.

Required features and permissions

Required feature flags:

  • Calibration (feature_calibration) — Enables calibration sessions (included by default)

  • KPI Reports (feature_kpi_reports) — Enables KPI reporting (included by default)

Required permissions:

  • Calibration Performance (Full View) — to see all evaluators

  • Calibration Performance (Scoped View) — to see evaluators in your reporting hierarchy

Note: Contact your evaluagent administrator if you don't have access to these features.

Navigation

  1. Navigate to Reporting in the main menu

  2. Click on Calibration Performance

  3. The report loads with default filters

Report views

Evaluator view

The default view showing calibration performance by individual evaluator:

  • Evaluator name — The evaluator being measured

  • Sessions facilitated — Number of sessions they led

  • Sessions attended — Number of sessions they participated in

  • Evaluations blind-scored — Number of evaluations scored independently

  • Outcome calibration performance — Match rate for overall outcomes

  • Average outcome difference — Average difference between blind and calibrated scores

  • Line items blind-scored — Number of individual criteria scored

  • Line item calibration performance — Match rate for line items

  • Root cause calibration performance — Match rate for root cause selections

Scorecard view

Shows calibration performance grouped by evaluation form:

  • Scorecard name — The evaluation form being measured

  • Evaluations blind-scored — Total blind evaluations for this form

  • Outcome calibration performance — Match rate for outcomes

  • Line items blind-scored — Total line items scored

  • Line item calibration performance — Match rate for line items

  • Root cause calibration performance — Match rate for root causes

My calibration results

A personal view for individual evaluators:

  • Shows only your own calibration data

  • Useful for self-assessment

  • Same metrics as the Evaluator view

Switching views

Use the dropdown in the top-right corner to switch between:

  • Evaluator view

  • Scorecard view

  • My calibration results

Understanding calibration metrics

Outcome calibration performance

Measures how often the blind-scored overall outcome matches the calibrated outcome:

  • 100% — Perfect alignment; all blind outcomes matched

  • Lower percentages — Indicate scoring discrepancies

  • Helps identify evaluators who may need additional training
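
To make the arithmetic concrete, here is a minimal Python sketch of how a match rate of this kind can be derived. The outcomes below are made up for illustration and are not evaluagent's actual data model:

    # Illustrative only: one entry per blind-scored evaluation, with the
    # outcome recorded before discussion ("blind") and after ("calibrated").
    evaluations = [
        {"blind": "Pass", "calibrated": "Pass"},
        {"blind": "Fail", "calibrated": "Pass"},
        {"blind": "Pass", "calibrated": "Pass"},
        {"blind": "Fail", "calibrated": "Fail"},
    ]

    matches = sum(1 for e in evaluations if e["blind"] == e["calibrated"])
    outcome_calibration = 100 * matches / len(evaluations)
    print(f"Outcome calibration performance: {outcome_calibration:.0f}%")  # 75%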

Average outcome difference

Shows the average difference between blind and calibrated scores:

  • 0 — No difference (perfect alignment)

  • Positive/negative values — Indicate whether an evaluator typically scores above or below the calibrated result

  • Useful for identifying systematic bias
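
A minimal sketch of that calculation, using made-up percentage scores and assuming a positive value means the blind score was higher than the calibrated one (the sign convention in your report may differ):

    # Illustrative blind vs calibrated overall scores (percentages).
    scores = [
        {"blind": 80, "calibrated": 85},  # blind score 5 points below calibrated
        {"blind": 92, "calibrated": 90},  # blind score 2 points above calibrated
        {"blind": 70, "calibrated": 78},  # blind score 8 points below calibrated
    ]

    # Signed gap: positive means the blind score was higher than the calibrated score.
    differences = [s["blind"] - s["calibrated"] for s in scores]
    average_outcome_difference = sum(differences) / len(differences)
    print(f"Average outcome difference: {average_outcome_difference:+.1f}")  # -3.7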

Line item calibration performance

Measures alignment on individual scorecard criteria:

  • Shows percentage of line items scored consistently

  • More granular than outcome calibration

  • Helps identify specific criteria with interpretation issues
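
If it helps to picture how this rolls up, the sketch below groups made-up line-item results by criterion to surface the ones scored least consistently. The criterion names and data layout are purely illustrative:

    from collections import defaultdict

    # Illustrative line-item results: each entry records whether the blind
    # score for a criterion matched the calibrated score.
    line_items = [
        {"criterion": "Greeting", "matched": True},
        {"criterion": "Greeting", "matched": True},
        {"criterion": "Compliance statement", "matched": False},
        {"criterion": "Compliance statement", "matched": True},
        {"criterion": "Compliance statement", "matched": False},
    ]

    per_criterion = defaultdict(lambda: {"matched": 0, "scored": 0})
    for item in line_items:
        per_criterion[item["criterion"]]["scored"] += 1
        per_criterion[item["criterion"]]["matched"] += int(item["matched"])

    # Criteria with low rates are candidates for clearer definitions or coaching.
    for criterion, counts in per_criterion.items():
        rate = 100 * counts["matched"] / counts["scored"]
        print(f"{criterion}: {rate:.0f}% ({counts['matched']}/{counts['scored']})")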

Root cause calibration performance

Measures agreement on root cause selections:

  • Applies when root causes are used in evaluations

  • Shows consistency in failure categorisation

  • Helps standardise problem identification

Filtering options

Customise the report with:

  • Date range — Period for calibration sessions

  • Scorecards — Specific evaluation forms to include

  • Evaluators — Specific evaluators to analyse

  • Other standard filters — As available for your account

Applying filters

  1. Click the filter controls

  2. Select your criteria

  3. Click Run Report to update results

Report features

Column customisation

  1. Click the column picker icon

  2. Toggle columns on/off

  3. Reorder columns as needed

  4. Save your preferences

Downloading data

  1. Click the Download link

  2. Data exports as CSV

  3. Includes all visible rows and columns
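
If you want to explore the export outside evaluagent, a small Python/pandas sketch like the one below can sort evaluators by calibration performance. The filename and column heading are assumptions; check the actual headers in your own download:

    import pandas as pd

    # The filename and column heading below are assumptions for illustration;
    # check the headers in your downloaded CSV.
    report = pd.read_csv("calibration_performance.csv")

    column = "Outcome calibration performance"
    if column in report.columns:
        # Surface the evaluators with the lowest match rates first.
        print(report.sort_values(column).head(10))
    else:
        print("Column not found; available columns:", list(report.columns))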

Drill-down to evaluator

In Evaluator view, click an evaluator name to see:

  • Individual session details

  • Line item performance breakdown

  • Specific calibration feedback

Interpreting results

Identifying training needs

Look for evaluators with:

  • Low outcome calibration percentages

  • High average outcome differences

  • Consistently low line item performance

  • Misaligned root cause selections

Scorecard issues

In Scorecard view, identify forms with:

  • Low overall calibration rates

  • Specific scorecards needing clarity

  • Line items with interpretation issues

Positive performance

Recognise evaluators with:

  • High calibration percentages

  • Consistent outcome alignment

  • Strong line item performance

Calibration session context

What is calibration?

Calibration sessions bring evaluators together to:

  • Score the same interactions independently ("blind")

  • Discuss and agree on correct scores ("calibrated")

  • Align on interpretation of criteria

  • Identify and resolve scoring discrepancies

Blind vs calibrated scores

  • Blind score — What the evaluator scored before discussion

  • Calibrated score — The agreed-upon score after discussion

  • Match — When blind equals calibrated

  • Difference — When blind differs from calibrated

Best practices

Regular review

  1. Review monthly — Track calibration trends

  2. After training — Measure the impact of coaching

  3. Before busy periods — Ensure alignment

Using the data

  1. Identify outliers — Focus on evaluators needing support

  2. Spot patterns — Find common misinterpretation areas

  3. Celebrate success — Recognise consistent evaluators

Action planning

Based on results:

  • Schedule additional calibration sessions

  • Provide targeted training

  • Clarify scorecard criteria

  • Update evaluation guidelines

Troubleshooting

No data showing

  • Verify calibration sessions exist in the date range

  • Check filter selections aren't too restrictive

  • Ensure the calibration feature is enabled

Low calibration scores across all evaluators

  • May indicate scorecard clarity issues

  • Consider reviewing criteria definitions

  • Schedule group calibration sessions

Cannot access evaluator details

  • Check your permission level

  • Scoped view limits visibility to evaluators in your reporting hierarchy

  • Contact administrator for full view access

Unexpected percentages

  • Verify the calculation basis (number of items)

  • Small sample sizes can skew percentages (for example, 2 matches out of 3 blind scores shows as 67%, while a single extra mismatch drops it to 50%)

  • Look at absolute numbers alongside percentages
