# AI Coding Agent PR Review Rubric

A practical rubric for engineers and leads reviewing coding-agent pull requests after the agent runs and before the work merges.

## Core rule

Review the agent's authority, not just the final diff. A coding-agent PR is merge-ready only when the approved intent, tool use, validation evidence, and human owner are all visible.

## How to score

Give each of the five sections below 0-4 points, for a maximum of 20:

- 0: missing
- 1: weak
- 2: partial
- 3: clear
- 4: strong

Score bands:

- 0-8: Do not merge yet. The PR may contain useful work, but reviewers cannot yet see intent, authority, validation, or ownership clearly enough.
- 9-16: Needs reviewer follow-up. The change is probably reviewable, but at least one important dimension needs stronger evidence before merge.
- 17-20: Review-ready. The PR connects approved intent, bounded authority, validation evidence, and human accountability well enough for normal code review.
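
The arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not part of the rubric itself: the section keys and the `score_band` function name are hypothetical, and the band labels are shortened from the descriptions above.

```python
from typing import Dict

# Hypothetical keys, one per rubric section (names are assumptions).
SECTIONS = (
    "approved_intent",
    "authority_and_tool_use",
    "validation_evidence",
    "human_ownership",
    "workflow_signal",
)


def score_band(scores: Dict[str, int]) -> str:
    """Sum five 0-4 section scores and map the total to a score band."""
    missing = [name for name in SECTIONS if name not in scores]
    if missing:
        raise ValueError(f"missing section scores: {missing}")
    for name in SECTIONS:
        if not 0 <= scores[name] <= 4:
            raise ValueError(f"{name} must be between 0 and 4")

    total = sum(scores[name] for name in SECTIONS)
    if total <= 8:
        return "Do not merge yet"
    if total <= 16:
        return "Needs reviewer follow-up"
    return "Review-ready"
```

A reviewer could call `score_band` with one integer per section. A total alone can hide a single section scored 0, so it is worth reading the per-section scores alongside the band.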

## 1. Approved intent matches the diff

Can a reviewer connect the final PR back to the pre-execution Goal Contract?

- [ ] The PR states the approved outcome in plain language, not just the agent transcript summary.
- [ ] Changed files stay inside the agreed blast radius or clearly call out where scope expanded.
- [ ] The author explains any agent-initiated detours, retries, or abandoned paths that affected the final diff.
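
One way to make the blast-radius check mechanical is to compare the changed file paths against the paths approved before the run. The sketch below assumes the approved blast radius is written as glob patterns, which this rubric does not specify; the function name and the example paths are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List


def files_outside_blast_radius(
    changed_files: Iterable[str],
    allowed_patterns: Iterable[str],
) -> List[str]:
    """Return changed files that match none of the approved glob patterns.

    Note: in fnmatch, "*" also matches "/" characters, so "src/auth/*"
    covers nested paths such as "src/auth/tokens/refresh.py".
    """
    patterns = list(allowed_patterns)
    return [
        path
        for path in changed_files
        if not any(fnmatchcase(path, pattern) for pattern in patterns)
    ]
```

Any path this returns is not automatically a problem. It is a prompt for the author to call out the scope expansion in the PR description.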

## 2. Authority and tool use are legible

Can the team see what authority the coding agent actually used?

- [ ] The PR handoff lists high-effect tools used: shell commands, MCP writes, external sends, migrations, deploy steps, or credentialed APIs.
- [ ] Permission expansions are linked to a human approval note instead of being buried in the transcript.
- [ ] Untrusted context such as issues, webpages, docs, and comments did not expand agent authority.

## 3. Validation evidence is reviewable

Can a reviewer verify success without mentally replaying the entire agent session?

- [ ] The PR includes targeted test, lint, typecheck, build, or manual verification output relevant to the change.
- [ ] Skipped checks are labeled either as blockers or as intentionally not applicable, never treated as implicit passes.
- [ ] Validation covers both the happy path and the failure mode the agent was asked to address.
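
The rule about skipped checks can be enforced with a small filter over the validation results. This is a sketch under assumptions: the `CheckResult` record, its status strings, and the two label values are hypothetical and would need to match however a team actually records check output.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical label values mirroring the two accepted reasons for a skip.
ACCEPTED_SKIP_LABELS = {"blocker", "not_applicable"}


@dataclass
class CheckResult:
    name: str
    status: str  # assumed values: "passed", "failed", or "skipped"
    skip_label: Optional[str] = None


def unlabeled_skips(results: List[CheckResult]) -> List[str]:
    """Return names of skipped checks that carry no accepted label."""
    return [
        result.name
        for result in results
        if result.status == "skipped"
        and result.skip_label not in ACCEPTED_SKIP_LABELS
    ]
```

An empty return value means every skip has been explained. It does not mean the skips are acceptable, since a check labeled `blocker` still has to be resolved before merge.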

## 4. Human ownership is explicit

Is a human taking responsibility for what merges?

- [ ] A human author states what they reviewed in the diff, generated content, and tool output.
- [ ] Risky areas such as auth, billing, secrets, data migrations, and production behavior get owner review before merge.
- [ ] The final PR description separates agent-generated claims from evidence the team can inspect.

## 5. Future workflow signal is captured

Did this PR teach the team anything about better agent workflow defaults?

- [ ] Repeated agent confusion, permission asks, or validation failures are captured as template or policy follow-up.
- [ ] The reviewer notes whether this workflow should become low-friction, stay human-gated, or be redesigned.
- [ ] The PR leaves behind reusable examples for similar future coding-agent work.

## Use this with the pre-run approval

The strongest review loop starts before the agent runs. Draft a Goal Contract in Caskade, approve the outcome and blast radius, then use this rubric to compare the final PR against the approved plan. During the current beta, generation requires sign-up; limited anonymous generation is a later roadmap item.

Start planning: https://app.caskade.dev/plan

Bring a real workflow: https://caskade.dev/?utm_source=resource&utm_medium=markdown&utm_campaign=agent-pr-review-rubric#access
