# Evaluation release gate

Use this release gate before production changes to models, prompts, retrieval indexes, tools, or orchestration logic.

## Gate evidence

| Dimension | Evidence |
| --- | --- |
| Task success | Golden task pass rate and reviewer notes |
| Grounding | Citation precision, unsupported claims, refusal quality |
| Safety | Sensitive data, tool authority, policy violations |
| Performance | Latency, cost, fallback, timeout, and retry behavior |
| Regression | Known failures, accepted regressions, and mitigations |
| Rollback | Version, owner, trigger, and recovery steps |

## Decision options

- Ship with no material regression.
- Ship with named residual risk and compensating control.
- Hold for remediation.
- Roll back because runtime evidence contradicts release assumptions.

## Output

A launch decision that ties approval to evidence, not model demos or subjective confidence.
