Code Review Engine
Oversight's code review system dispatches multiple specialized AI agents in parallel to review pull requests with deterministic, comprehensive coverage. Every modified function is traced end-to-end across six distinct review perspectives.
Architecture
Reviews are executed by a local runner daemon that spawns Claude Code with the full `pr-review` prompt. The reviewer clones the target repository, fetches the PR ref into an isolated git worktree, and launches the multi-agent review process with direct access to the source code.
This architecture means the review has full codebase context — it reads entire files, traces imports, checks test coverage, and understands how changes interact with existing code. It is not limited to reviewing just the diff.
The 6 Agent Types
Every review dispatches all six agents in parallel. Each agent focuses on a specific dimension of code quality:
| Agent | Focus |
|---|---|
| code-reviewer | Reviews all modified source files, test files, and CI/packaging configuration. Traces data flows, validates error handling, checks edge cases, verifies correctness of business logic, and ensures version consistency across workflows and package manifests. |
| silent-failure-hunter | Specifically looks for failures that would be swallowed silently — empty catch blocks, ignored promise rejections, missing error propagation, and fallback values that mask bugs. |
| pr-test-analyzer | Evaluates test coverage for the changes. Checks whether new code paths have tests, whether existing tests are updated for changed behavior, and whether test assertions are correct. |
| type-design-analyzer | Reviews TypeScript type safety, interface design, generic constraints, and type narrowing. Catches `as any` casts, incorrect type predicates, and incomplete discriminated unions. |
| comment-analyzer | Reads PR comments and review threads for prior context. Verifies that previously raised issues were addressed, and identifies new concerns from discussion that the other agents should be aware of. |
| code-simplifier | Identifies opportunities to reduce complexity — redundant abstractions, over-engineered patterns, dead code, and unnecessary indirection. Suggests simplifications that preserve behavior while improving readability and maintainability. |
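
The parallel dispatch described above can be sketched as follows. This is a minimal illustration, not the engine's actual code: only the six agent names come from the table, while `dispatchAll`, `RunAgent`, and `AgentReport` are hypothetical names. The sketch assumes that a crash in one agent should not discard the other five reports, which the documentation does not state.

```typescript
// The six agent names come from the table above; every other name is hypothetical.
const AGENTS = [
  "code-reviewer",
  "silent-failure-hunter",
  "pr-test-analyzer",
  "type-design-analyzer",
  "comment-analyzer",
  "code-simplifier",
] as const;

type AgentName = (typeof AGENTS)[number];

interface AgentReport {
  agent: AgentName;
  status: "ok" | "failed";
  findings: string[];
  error?: string;
}

// Hypothetical signature: runs one agent against a prompt and resolves with its findings.
type RunAgent = (agent: AgentName, prompt: string) => Promise<string[]>;

async function dispatchAll(prompt: string, runAgent: RunAgent): Promise<AgentReport[]> {
  // map() starts every agent before any of them is awaited, so all six run concurrently.
  return Promise.all(
    AGENTS.map(async (agent): Promise<AgentReport> => {
      try {
        return { agent, status: "ok", findings: await runAgent(agent, prompt) };
      } catch (err) {
        // Assumption: a failed agent is recorded rather than aborting the whole review.
        return { agent, status: "failed", findings: [], error: String(err) };
      }
    }),
  );
}
```

Catching errors per agent keeps the result array aligned with `AGENTS`, so a consolidation step can tell which perspective is missing from a given run.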
Strategy Selection
The review engine automatically selects the appropriate strategy based on diff characteristics:
Standard 6-Agent CR
Used when the diff is under 2,000 lines and affects 10 or fewer files. All six agents receive the full diff and complete review manifest, then report findings in parallel.
MSAL (Module-Scoped Agent Loops)
Used when the diff exceeds 2,000 lines and the changed files decompose into independent modules. MSAL runs in three phases:
- Phase 1 — Module-Scoped Loops: Up to 6 agents run in parallel, each scoped to a single module's files. Each agent reads, reviews, and loops until it reports clean. Modules are prioritized by size.
- Phase 2 — Cross-Cutting Review: After all module agents finish, 2-3 integration agents check API surface consistency, cross-module data flows, and CI/packaging.
- Phase 3 — Consolidation: All findings are merged, deduplicated, and assigned final severity.
Standard 6-Agent CR with Focused Rounds
Used when the diff exceeds 2,000 lines but lacks clear module boundaries. The standard six agents run, but with focused review rounds that prioritize high-risk areas.
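The selection rules above can be sketched as a single function. This is an illustration under stated assumptions: `DiffStats`, `selectStrategy`, and the `hasIndependentModules` flag are hypothetical names, and how module independence is detected is not covered here. The rules above do not say what happens at exactly 2,000 lines, or for a small diff that touches more than 10 files; the sketch treats both cases like a large diff, which is a guess.

```typescript
type Strategy = "standard" | "msal" | "standard-focused-rounds";

interface DiffStats {
  changedLines: number;
  changedFiles: number;
  // Hypothetical flag: true when the changed files decompose into independent modules.
  hasIndependentModules: boolean;
}

function selectStrategy(diff: DiffStats): Strategy {
  // Documented rule: under 2,000 lines and 10 or fewer files.
  if (diff.changedLines < 2000 && diff.changedFiles <= 10) {
    return "standard";
  }
  // Documented rule for diffs over 2,000 lines. Assumption: the same branch also
  // handles the unspecified cases (exactly 2,000 lines, or more than 10 files).
  return diff.hasIndependentModules ? "msal" : "standard-focused-rounds";
}
```

For example, `selectStrategy({ changedLines: 3500, changedFiles: 40, hasIndependentModules: true })` yields `"msal"`.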
Review Manifest
Before dispatching agents, the orchestrator builds a review manifest: an explicit checklist of every modified file, function, method, and class, plus any CI/packaging files. The manifest is included in every agent's prompt, and agents are required to report on every item, so the set of items covered is the same on every run. An example manifest:
```
Modified files:
  src/runner.ts
    - executeResolve()
    - spawnClaude()
  src/reviewer.ts
    - launchReview()
    - buildCLIArgs()
  src/engagement.ts
    - computeEngagementSignals()

CI/packaging:
  .github/workflows/ci.yml
  package.json
```
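
One way to enforce "agents are required to report on every item" is to flatten the manifest into a list of item keys and diff it against what an agent reported. The sketch below assumes a data shape the documentation does not specify: `ReviewManifest`, `manifestItems`, `missingItems`, and the `path#symbol` key format are all hypothetical. The example data mirrors the manifest shown above.

```typescript
interface ManifestFile {
  path: string;
  symbols: string[];
}

interface ReviewManifest {
  modifiedFiles: ManifestFile[];
  ciPackaging: string[];
}

// Flatten the manifest into one key per checklist item.
// Hypothetical key format: "path" for files, "path#symbol" for functions and methods.
function manifestItems(manifest: ReviewManifest): string[] {
  const items: string[] = [];
  for (const file of manifest.modifiedFiles) {
    items.push(file.path);
    for (const symbol of file.symbols) {
      items.push(`${file.path}#${symbol}`);
    }
  }
  for (const path of manifest.ciPackaging) {
    items.push(path);
  }
  return items;
}

// Return every manifest item the agent did not report on, in manifest order.
function missingItems(manifest: ReviewManifest, reported: string[]): string[] {
  const seen = new Set(reported);
  return manifestItems(manifest).filter((item) => !seen.has(item));
}

const exampleManifest: ReviewManifest = {
  modifiedFiles: [
    { path: "src/runner.ts", symbols: ["executeResolve()", "spawnClaude()"] },
    { path: "src/reviewer.ts", symbols: ["launchReview()", "buildCLIArgs()"] },
    { path: "src/engagement.ts", symbols: ["computeEngagementSignals()"] },
  ],
  ciPackaging: [".github/workflows/ci.yml", "package.json"],
};
```

An orchestrator could reject or re-prompt an agent whose `missingItems` result is non-empty; whether the real engine does so is not stated here.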
Orchestrator Pre-Audit
Before dispatching any review agents, the orchestrator performs its own audit. Agents are a confirmation layer, not a discovery layer. The pre-audit traces:
- Every input — where it comes from, what types and values are possible
- Every output — where it goes, who consumes it, what they expect
- Every error path — does it surface or swallow?
- Pre-existing code in modified files — the entire file is read, not just the diff
Git Worktree Isolation
Each review runs in its own git worktree, created from the PR's head ref. This means multiple reviews can run concurrently without interference: the agents of each review work in a dedicated directory with that PR's code checked out, separate from every other review.
Worktrees are cleaned up after each review. If a stale worktree from a previous run blocks a fetch, the reviewer automatically prunes dead worktrees and retries.
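The worktree lifecycle, including the prune-and-retry step, might look roughly like the sketch below. Several details are assumptions: the `review/pr-<n>` branch name, the GitHub-style `pull/<n>/head` ref, and the function names are all hypothetical, and the sketch is synchronous for brevity where a real runner would be asynchronous. The `Git` parameter stands in for a wrapper that runs `git` with the given arguments and throws on a non-zero exit, for example around `execFileSync` from `node:child_process`.

```typescript
// Runs `git <args>` in the cloned repository and throws on failure (injected for testability).
type Git = (args: string[]) => void;

function fetchPrRef(git: Git, prNumber: number): string {
  const branch = `review/pr-${prNumber}`; // hypothetical branch naming
  const refspec = `+pull/${prNumber}/head:${branch}`; // GitHub-style PR ref (assumption)
  try {
    git(["fetch", "origin", refspec]);
  } catch {
    // A stale worktree can still hold the branch and block the fetch.
    // Prune dead worktree records and retry once; a second failure propagates.
    git(["worktree", "prune"]);
    git(["fetch", "origin", refspec]);
  }
  return branch;
}

function withReviewWorktree<T>(
  git: Git,
  prNumber: number,
  dir: string,
  review: (worktreeDir: string) => T,
): T {
  const branch = fetchPrRef(git, prNumber);
  git(["worktree", "add", dir, branch]);
  try {
    return review(dir);
  } finally {
    // Clean up even when the review throws.
    git(["worktree", "remove", "--force", dir]);
  }
}
```

Retrying only once keeps a genuine fetch failure, such as a network error, from being hidden behind repeated prune attempts.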
Output Format
Reviews are consolidated into a single structured markdown document with these sections:
- Critical Issues — bugs, security vulnerabilities, data loss risks
- Important Issues — correctness problems, missing error handling, type safety gaps
- Suggestions — code quality improvements, simplifications, style recommendations
- Strengths — what the PR does well
- Recommended Action — approve, request changes, or needs discussion
All agent attribution is stripped from the final output — findings stand on their own without any indication of which agent produced them.
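Merging, deduplicating, and stripping attribution can be sketched in a few lines. The finding shape and the dedup key (file plus normalized message) are assumptions made for illustration, as is the rule that the highest severity wins when two agents report the same issue; the documentation does not specify any of these.

```typescript
type Severity = "critical" | "important" | "suggestion";

// Hypothetical shape of a finding as emitted by an individual agent.
interface AgentFinding {
  agent: string;
  severity: Severity;
  file: string;
  message: string;
}

// Consolidated finding: same data, with no record of which agent produced it.
interface Finding {
  severity: Severity;
  file: string;
  message: string;
}

const SEVERITY_RANK: Record<Severity, number> = { critical: 0, important: 1, suggestion: 2 };

function consolidate(raw: AgentFinding[]): Finding[] {
  const byKey = new Map<string, Finding>();
  // Destructuring without `agent` is what strips the attribution.
  for (const { severity, file, message } of raw) {
    const text = message.trim();
    const key = `${file}::${text.toLowerCase()}`; // assumed dedup key
    const existing = byKey.get(key);
    // Assumption: when duplicates disagree on severity, keep the most severe.
    if (!existing || SEVERITY_RANK[severity] < SEVERITY_RANK[existing.severity]) {
      byKey.set(key, { severity, file, message: text });
    }
  }
  // Order matches the output sections: critical, then important, then suggestions.
  return Array.from(byKey.values()).sort(
    (a, b) => SEVERITY_RANK[a.severity] - SEVERITY_RANK[b.severity],
  );
}
```

Exact-match deduplication is deliberately naive; two agents describing the same bug in different words would survive as separate findings in this sketch.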
Post-Review Actions
After a review is generated, you can:
- Amend — ask Claude to revise specific parts of the review
- Save — persist edits and sync to Notion
- Post to GitHub — publish as a PR review comment, with options for COMMENT, REQUEST_CHANGES, or APPROVE
- Resolve — hand off to the resolve workflow to automatically fix all findings
Triggering a Review
```http
POST /api/tasks
Content-Type: application/json

{
  "repoId": 1,
  "itemNumber": 42,
  "taskType": "review"
}
```
If a review is already queued or running for the same PR, the API returns the existing task instead of creating a duplicate. Progress is streamed via Supabase Realtime — the web UI shows live output as the review runs.
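From a script, the request above could be built with a small helper. This is a sketch only: `buildReviewRequest` and the base URL are hypothetical, authentication is omitted because it is not described here, and the response body is left unparsed because its shape is not documented.

```typescript
interface ReviewTaskBody {
  repoId: number;
  itemNumber: number;
  taskType: "review";
}

interface BuiltRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Builds the POST /api/tasks request shown above without sending it.
function buildReviewRequest(baseUrl: string, repoId: number, itemNumber: number): BuiltRequest {
  const body: ReviewTaskBody = { repoId, itemNumber, taskType: "review" };
  return {
    // Drop trailing slashes so the path is not doubled.
    url: `${baseUrl.replace(/\/+$/, "")}/api/tasks`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(body),
    },
  };
}
```

The result can be passed straight to `fetch(url, init)`. Because the API returns the existing task when a review is already queued or running for the same PR, sending the same request twice should be safe.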