Code Review Engine

Oversight's code review system dispatches multiple specialized AI agents in parallel to review pull requests with deterministic, comprehensive coverage. Every modified function is traced end-to-end across six distinct review perspectives.

Architecture

Reviews are executed by a local runner daemon that spawns Claude Code with the full pr-review prompt. The reviewer clones the target repository, fetches the PR ref into an isolated git worktree, and launches the multi-agent review process with direct access to the source code.

This architecture means the review has full codebase context — it reads entire files, traces imports, checks test coverage, and understands how changes interact with existing code. It is not limited to reviewing just the diff.

The 6 Agent Types

Every review dispatches all six agents in parallel. Each agent focuses on a specific dimension of code quality:

Agent	Focus
code-reviewer	Reviews all modified source files, test files, and CI/packaging configuration. Traces data flows, validates error handling, checks edge cases, verifies correctness of business logic, and ensures version consistency across workflows and package manifests.
silent-failure-hunter	Specifically looks for failures that would be swallowed silently — empty catch blocks, ignored promise rejections, missing error propagation, and fallback values that mask bugs.
pr-test-analyzer	Evaluates test coverage for the changes. Checks whether new code paths have tests, whether existing tests are updated for changed behavior, and whether test assertions are correct.
type-design-analyzer	Reviews TypeScript type safety, interface design, generic constraints, and type narrowing. Catches `as any` casts, incorrect type predicates, and incomplete discriminated unions.
comment-analyzer	Reads PR comments and review threads for prior context. Verifies that previously raised issues were addressed, and identifies new concerns from discussion that the other agents should be aware of.
code-simplifier	Identifies opportunities to reduce complexity — redundant abstractions, over-engineered patterns, dead code, and unnecessary indirection. Suggests simplifications that preserve behavior while improving readability and maintainability.

All agents run on Claude Opus. Oversight never uses Haiku or Sonnet for reviews — every agent gets the most capable model to ensure thorough analysis.
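The parallel dispatch described above can be sketched roughly as follows. The agent names come from the table; the `runAgent` callback and the report shape are hypothetical stand-ins, since the actual agent invocation is not documented here.

```typescript
// Agent names are taken from the table above; everything else is illustrative.
const AGENTS = [
  "code-reviewer",
  "silent-failure-hunter",
  "pr-test-analyzer",
  "type-design-analyzer",
  "comment-analyzer",
  "code-simplifier",
] as const;

type AgentName = (typeof AGENTS)[number];

// Hypothetical callback that runs one agent and resolves with its findings.
type RunAgent = (agent: AgentName, prompt: string) => Promise<string[]>;

type AgentOutcome =
  | { agent: AgentName; ok: true; findings: string[] }
  | { agent: AgentName; ok: false; reason: unknown };

// Start all six agents at once and wait for every one of them to finish.
// Each failure is captured per agent so it is reported rather than swallowed,
// and one failing agent does not discard the other agents' findings.
async function dispatchAll(prompt: string, runAgent: RunAgent) {
  const outcomes: AgentOutcome[] = await Promise.all(
    AGENTS.map(async (agent): Promise<AgentOutcome> => {
      try {
        return { agent, ok: true, findings: await runAgent(agent, prompt) };
      } catch (reason) {
        return { agent, ok: false, reason };
      }
    }),
  );
  const reports = outcomes.filter((o) => o.ok);
  const failures = outcomes.filter((o) => !o.ok);
  return { reports, failures };
}
```

Because the agents are started inside a single `Promise.all`, the total wall-clock time is bounded by the slowest agent rather than the sum of all six.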

Strategy Selection

The review engine automatically selects the appropriate strategy based on diff characteristics:

Standard 6-Agent CR

Used when the diff is under 2,000 lines and affects 10 or fewer files. All six agents receive the full diff and complete review manifest, then report findings in parallel.

MSAL (Module-Scoped Agent Loops)

Used when the diff exceeds 2,000 lines and the changed files decompose into independent modules. MSAL runs in three phases:

Phase 1 — Module-Scoped Loops: Up to 6 agents run in parallel, each scoped to a single module's files. Each agent reads, reviews, and loops until it reports clean. Modules are prioritized by size.
Phase 2 — Cross-Cutting Review: After all module agents finish, 2-3 integration agents check API surface consistency, cross-module data flows, and CI/packaging.
Phase 3 — Consolidation: All findings are merged, deduplicated, and assigned final severity.
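As a rough illustration of how Phase 1 might be planned, the sketch below groups changed files into modules, orders the modules by size, and splits them into groups of at most six. Several details are assumptions not stated in the text: a module is taken to be a file's parent directory, "prioritized by size" is read as largest first, and modules beyond the first six are assumed to run in later waves.

```typescript
interface ChangedFile {
  path: string;
  changedLines: number;
}

interface ModulePlan {
  module: string;
  files: string[];
  changedLines: number;
}

// From the text: up to 6 module-scoped agents run in parallel.
const MAX_PARALLEL_MODULE_AGENTS = 6;

// Assumption: a module is the parent directory of a file.
function moduleOf(path: string): string {
  const slash = path.lastIndexOf("/");
  return slash === -1 ? "." : path.slice(0, slash);
}

function planModuleWaves(files: ChangedFile[]): ModulePlan[][] {
  const byModule = new Map<string, ModulePlan>();
  for (const file of files) {
    const key = moduleOf(file.path);
    const plan = byModule.get(key) ?? { module: key, files: [], changedLines: 0 };
    plan.files.push(file.path);
    plan.changedLines += file.changedLines;
    byModule.set(key, plan);
  }

  // Assumption: larger modules are reviewed first.
  const ordered = Array.from(byModule.values()).sort(
    (a, b) => b.changedLines - a.changedLines,
  );

  // Assumption: when there are more than six modules, the rest wait for a later wave.
  const waves: ModulePlan[][] = [];
  for (let i = 0; i < ordered.length; i += MAX_PARALLEL_MODULE_AGENTS) {
    waves.push(ordered.slice(i, i + MAX_PARALLEL_MODULE_AGENTS));
  }
  return waves;
}
```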

Standard 6-Agent CR with Focused Rounds

Used when the diff exceeds 2,000 lines but lacks clear module boundaries. The standard six agents run, but with focused review rounds that prioritize high-risk areas.
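The selection rules above could be expressed as a small decision function. This is a sketch only: the `DiffStats` shape and the `hasIndependentModules` flag are hypothetical, and the text does not say what happens for a diff of exactly 2,000 lines or for a diff under 2,000 lines that touches more than 10 files, so the fallback for those cases is an assumption.

```typescript
type Strategy = "standard" | "msal" | "standard-focused";

// Hypothetical summary of a diff; field names are illustrative.
interface DiffStats {
  changedLines: number;
  changedFiles: number;
  // Whether the changed files decompose into independent modules.
  hasIndependentModules: boolean;
}

const LINE_THRESHOLD = 2000;
const FILE_THRESHOLD = 10;

function selectStrategy(diff: DiffStats): Strategy {
  // Standard: under 2,000 lines and 10 or fewer files.
  if (diff.changedLines < LINE_THRESHOLD && diff.changedFiles <= FILE_THRESHOLD) {
    return "standard";
  }
  // Over 2,000 lines: MSAL if modules are independent, focused rounds otherwise.
  if (diff.changedLines > LINE_THRESHOLD) {
    return diff.hasIndependentModules ? "msal" : "standard-focused";
  }
  // Assumption: cases the text leaves unspecified fall back to focused rounds.
  return "standard-focused";
}
```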

Review Manifest

Before dispatching agents, the orchestrator builds a review manifest — an explicit checklist of every modified file, function, method, and class. This manifest is included in every agent's prompt, and agents are required to report on every item. This ensures deterministic, reproducible coverage across runs.

Example Review Manifest

Modified files:
  src/runner.ts
    - executeResolve()
    - spawnClaude()
  src/reviewer.ts
    - launchReview()
    - buildCLIArgs()
  src/engagement.ts
    - computeEngagementSignals()

CI/packaging:
  .github/workflows/ci.yml
  package.json
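One way to model the manifest shown above is as a small data structure that can be rendered into the prompt text and later used to check that an agent reported on every item. The type names, the `path#symbol` key format, and the `missingItems` helper are hypothetical; the source does not describe the orchestrator's internal representation.

```typescript
// Hypothetical in-memory shape of a review manifest.
interface ManifestFile {
  path: string;
  symbols: string[]; // modified functions, methods, and classes
}

interface ReviewManifest {
  files: ManifestFile[];
  ciPackaging: string[];
}

// Render the manifest in the same layout as the example above.
function renderManifest(manifest: ReviewManifest): string {
  const lines: string[] = ["Modified files:"];
  for (const file of manifest.files) {
    lines.push(`  ${file.path}`);
    for (const symbol of file.symbols) {
      lines.push(`    - ${symbol}`);
    }
  }
  lines.push("", "CI/packaging:");
  for (const path of manifest.ciPackaging) {
    lines.push(`  ${path}`);
  }
  return lines.join("\n");
}

// Return every manifest item that an agent's report did not cover.
// Items are keyed as "path#symbol" for symbols and "path" for CI/packaging files.
function missingItems(manifest: ReviewManifest, reported: Set<string>): string[] {
  const expected: string[] = [];
  for (const file of manifest.files) {
    for (const symbol of file.symbols) {
      expected.push(`${file.path}#${symbol}`);
    }
  }
  for (const path of manifest.ciPackaging) {
    expected.push(path);
  }
  return expected.filter((item) => !reported.has(item));
}
```

A non-empty result from `missingItems` would indicate that the agent skipped part of the checklist and its report is incomplete.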

Orchestrator Pre-Audit

Before dispatching any review agents, the orchestrator performs its own audit. Agents are a confirmation layer, not a discovery layer. The pre-audit traces:

Every input — where it comes from, what types and values are possible
Every output — where it goes, who consumes it, what they expect
Every error path — does it surface or swallow?
Pre-existing code in modified files — the entire file is read, not just the diff

Git Worktree Isolation

Each review runs in its own git worktree, created from the PR's head ref. Multiple reviews can therefore run concurrently without interference: each review has its own isolated working directory with the PR's code checked out.

Worktrees are cleaned up after each review. If a stale worktree from a previous run blocks a fetch, the reviewer automatically prunes dead worktrees and retries.
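The fetch, prune, and retry sequence might look something like the sketch below. It assumes a GitHub remote named `origin` that exposes `refs/pull/<n>/head`, uses a hypothetical local branch name (`oversight/pr-<n>`), and takes a hypothetical `RunGit` callback in place of a real process spawner. The git subcommands themselves (`fetch`, `worktree add`, `worktree prune`, `worktree remove --force`) are standard.

```typescript
// Hypothetical command runner: resolves when `git <args>` exits with 0, rejects otherwise.
type RunGit = (args: string[]) => Promise<void>;

async function prepareWorktree(
  run: RunGit,
  prNumber: number,
  worktreePath: string,
): Promise<void> {
  const branch = `oversight/pr-${prNumber}`; // hypothetical branch name
  const fetchArgs = [
    "fetch",
    "origin",
    `+refs/pull/${prNumber}/head:refs/heads/${branch}`,
  ];

  try {
    await run(fetchArgs);
  } catch {
    // Git refuses to fetch into a branch that a worktree still has checked out.
    // If that worktree's directory is gone, pruning clears the stale record.
    // This sketch does not inspect the cause; a second failure propagates.
    await run(["worktree", "prune"]);
    await run(fetchArgs);
  }

  await run(["worktree", "add", worktreePath, branch]);
}

// Clean up after the review finishes.
async function removeWorktree(run: RunGit, worktreePath: string): Promise<void> {
  await run(["worktree", "remove", "--force", worktreePath]);
}
```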

Output Format

Reviews are consolidated into a single structured markdown document with these sections:

Critical Issues — bugs, security vulnerabilities, data loss risks
Important Issues — correctness problems, missing error handling, type safety gaps
Suggestions — code quality improvements, simplifications, style recommendations
Strengths — what the PR does well
Recommended Action — approve, request changes, or needs discussion

All agent attribution is stripped from the final output — findings stand on their own without any indication of which agent produced them.
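A minimal sketch of the consolidation step, under stated assumptions: the finding shape is hypothetical, duplicates are detected by file plus normalized message, and when duplicates disagree on severity the most severe one is kept. The source only says findings are merged, deduplicated, and stripped of agent attribution; it does not specify how.

```typescript
type Severity = "critical" | "important" | "suggestion";

// Hypothetical raw finding as reported by a single agent.
interface AgentFinding {
  agent: string;
  severity: Severity;
  file: string;
  message: string;
}

// Consolidated findings carry no agent field at all.
interface ConsolidatedFinding {
  severity: Severity;
  file: string;
  message: string;
}

const SEVERITY_RANK: Record<Severity, number> = {
  critical: 0,
  important: 1,
  suggestion: 2,
};

function consolidate(findings: AgentFinding[]): ConsolidatedFinding[] {
  const byKey = new Map<string, ConsolidatedFinding>();
  // Destructuring without `agent` is what strips the attribution.
  for (const { severity, file, message } of findings) {
    // Assumption: same file and same normalized message means the same finding.
    const key = `${file}\u0000${message.trim().toLowerCase()}`;
    const existing = byKey.get(key);
    // Assumption: keep the most severe rating among duplicates.
    if (!existing || SEVERITY_RANK[severity] < SEVERITY_RANK[existing.severity]) {
      byKey.set(key, { severity, file, message });
    }
  }
  // Order by section: critical, then important, then suggestions.
  return Array.from(byKey.values()).sort(
    (a, b) => SEVERITY_RANK[a.severity] - SEVERITY_RANK[b.severity],
  );
}
```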

Post-Review Actions

After a review is generated, you can:

Amend — ask Claude to revise specific parts of the review
Save — persist edits and sync to Notion
Post to GitHub — publish as a PR review comment, with options for COMMENT, REQUEST_CHANGES, or APPROVE
Resolve — hand off to the resolve workflow to automatically fix all findings
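For the Post to GitHub action, the three options match the `event` values accepted by GitHub's REST endpoint for creating a pull request review. The sketch below builds such a request. Whether Oversight calls this endpoint directly is an assumption, and the function and parameter names are illustrative.

```typescript
type ReviewEvent = "COMMENT" | "REQUEST_CHANGES" | "APPROVE";

interface GitHubReviewRequest {
  method: "POST";
  url: string;
  headers: Record<string, string>;
  body: string;
}

// Build a request for POST /repos/{owner}/{repo}/pulls/{pull_number}/reviews.
function buildReviewRequest(
  owner: string,
  repo: string,
  pullNumber: number,
  markdown: string,
  event: ReviewEvent,
  token: string,
): GitHubReviewRequest {
  // GitHub requires a non-empty body for COMMENT and REQUEST_CHANGES reviews.
  if (event !== "APPROVE" && markdown.trim() === "") {
    throw new Error(`A review body is required for ${event}`);
  }
  return {
    method: "POST",
    url: `https://api.github.com/repos/${owner}/${repo}/pulls/${pullNumber}/reviews`,
    headers: {
      Accept: "application/vnd.github+json",
      Authorization: `Bearer ${token}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ body: markdown, event }),
  };
}
```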

Triggering a Review

API Request

POST /api/tasks
Content-Type: application/json

{
  "repoId": 1,
  "itemNumber": 42,
  "taskType": "review"
}

If a review is already queued or running for the same PR, the API returns the existing task instead of creating a duplicate. Progress is streamed via Supabase Realtime — the web UI shows live output as the review runs.
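From a client, the request above could be sent with a small helper like this one. The `FetchLike` type, the `baseUrl` parameter, and the untyped return value are assumptions; the response schema is not documented here, so the sketch returns the parsed JSON as `unknown` instead of guessing at field names.

```typescript
// Request body exactly as shown in the API example above.
interface ReviewTaskRequest {
  repoId: number;
  itemNumber: number;
  taskType: "review";
}

// Minimal subset of the fetch API, injected so the helper can be tested offline.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

async function triggerReview(
  fetchImpl: FetchLike,
  baseUrl: string, // hypothetical, e.g. the origin of your Oversight deployment
  repoId: number,
  itemNumber: number,
): Promise<unknown> {
  const payload: ReviewTaskRequest = { repoId, itemNumber, taskType: "review" };
  const res = await fetchImpl(`${baseUrl}/api/tasks`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(payload),
  });
  if (!res.ok) {
    throw new Error(`POST /api/tasks failed with status ${res.status}`);
  }
  // Per the text, this is either a new task or the existing one for the same PR.
  return res.json();
}
```

Because the API returns the existing task when one is already queued or running, calling `triggerReview` twice for the same PR should be safe from the client's point of view.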