Question 1

What is the Reviews module in Evalgent?

Accepted Answer

Reviews is Evalgent's human-in-the-loop layer for AI voice agent evaluations. Engineering and QA teams inspect failed scenarios, appeal LLM-judged verdicts they disagree with, correct metric outcomes, and tag failure modes. The full audit trail of who reviewed what, when, and why stays attached to each scenario run.

Question 2

When should I appeal an LLM-judged evaluation verdict?

Accepted Answer

Appeal when the LLM scored a scenario as failed but you believe the call actually met the success criterion, or when it passed a call that you believe did not. The most common case is the LLM penalizing a tone variation that the rubric should have accepted. Each appeal updates the scoring model for that metric so the same edge case scores correctly next time.

Question 3

How does the audit trail work in Evalgent?

Accepted Answer

Every evaluation run, verdict, appeal, and metric change in Evalgent is logged with a timestamp and the user responsible. Reviews exposes this trail so engineering teams can trace why a scenario succeeded or failed, who challenged a verdict, and what changed. Regulated voice deployments need this level of traceability, and Evalgent is audit-ready by default.
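To make the idea concrete, here is a minimal sketch of an append-only audit trail that records each event with a timestamp and user. It is an illustration only, not Evalgent's API: the `AuditEvent` and `AuditTrail` names, their fields, and the event kinds are all assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)  # frozen: a logged event cannot be edited afterwards
class AuditEvent:
    run_id: str          # hypothetical scenario run identifier
    kind: str            # e.g. "evaluation", "verdict", "appeal", "metric_change"
    user: str            # who triggered the event
    detail: str          # free-text reason or description
    timestamp: datetime  # when the event was logged (UTC)


class AuditTrail:
    """Hypothetical append-only log; events are added, never modified."""

    def __init__(self):
        self._events = []

    def log(self, run_id, kind, user, detail):
        event = AuditEvent(run_id, kind, user, detail, datetime.now(timezone.utc))
        self._events.append(event)
        return event

    def for_run(self, run_id):
        # Events come back in the order they were logged.
        return [e for e in self._events if e.run_id == run_id]


trail = AuditTrail()
trail.log("run-1", "verdict", "llm-judge", "failed: tone too informal")
trail.log("run-1", "appeal", "alice", "tone variation is acceptable under the rubric")
trail.log("run-2", "verdict", "llm-judge", "passed")

history = trail.for_run("run-1")
for event in history:
    print(event.timestamp.isoformat(), event.kind, event.user, event.detail)
```

Filtering by run is what lets a reviewer answer "who challenged this verdict, and why" for a single scenario run without scanning the whole log.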

Question 4

Does appealing an LLM verdict change historical metrics?

Accepted Answer

Appealed verdicts update the affected metric for that scenario run but preserve the historical record. The aggregate scenario success rate is recalculated if the appeal flips the verdict. The original LLM verdict, the appeal, and the resolution all remain visible in the audit trail, so voice agent QA stays defensible.
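The behavior described above can be sketched as follows: the original LLM verdict is stored and never overwritten, an approved appeal changes only the effective verdict, and the success rate is derived from effective verdicts. This is a hedged illustration under assumed names (`ScenarioRun`, `effective_verdict`, `success_rate`, and the appeal fields are hypothetical), not Evalgent's actual data model.

```python
from dataclasses import dataclass, field


@dataclass
class ScenarioRun:
    run_id: str
    llm_verdict: bool                            # original LLM verdict, kept as-is
    appeals: list = field(default_factory=list)  # appeal records, oldest first

    @property
    def effective_verdict(self):
        # The most recent approved appeal wins; otherwise the LLM verdict stands.
        for appeal in reversed(self.appeals):
            if appeal["approved"]:
                return appeal["corrected_verdict"]
        return self.llm_verdict


def success_rate(runs):
    # Fraction of runs whose effective verdict is a pass.
    return sum(r.effective_verdict for r in runs) / len(runs)


runs = [
    ScenarioRun("run-1", True),
    ScenarioRun("run-2", False),
    ScenarioRun("run-3", False),
    ScenarioRun("run-4", True),
]

before = success_rate(runs)

# An approved appeal flips run-2 from failed to passed.
runs[1].appeals.append({
    "by": "alice",
    "approved": True,
    "corrected_verdict": True,
    "reason": "tone variation is acceptable under the rubric",
})

after = success_rate(runs)
print(before, after, runs[1].llm_verdict)
```

Because the aggregate is computed from the effective verdict rather than stored, the rate changes only when an approved appeal actually flips a verdict, and the original verdict remains available for audit.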

Question 5

Who can review LLM verdicts in Evalgent?

Accepted Answer

Any team member with reviewer permissions can submit an appeal, and senior reviewers approve or reject it. Most teams assign engineering to the technical metrics, QA to the qualitative ones, and product to borderline cases. Permissions and reviewer assignments are configurable per project.

Challenge LLM judgements with human-in-the-loop reviews

What is a voice agent review?

How does the review process work?

Flag a judgement

Submit your appeal

Get a decision

What you get back

Corrected outcomes

Recalculated metrics

Audit trail

The difference human reviews make

Trust the LLM blindly

Human-corrected accuracy

Frequently asked questions
