Deep Review: AI That Explores Your Entire Codebase Before Reviewing Your PR
Most AI code review tools only scan the diff. Deep Review reads your full project — files, configs, tests, dependencies — and catches cross-file bugs that diff-only tools miss. Here's how it works.
Tired of slow code reviews? AI catches issues in seconds. You decide what gets published.
Every AI code review tool has the same blind spot
They all do the same thing: scan the diff, generate comments, post them to your PR.
The diff is a narrow window. It shows what changed. Not what that change breaks three directories away.
Here's a scenario we've all lived through. A developer renames a utility function. The diff looks fine, clean refactor, consistent naming. But the function is imported in 14 other files. Three of those imports now point to nothing. The tests still pass because those code paths aren't covered. Production breaks on Tuesday.
A diff-only tool would never flag this. The renamed file wasn't in the diff. The broken imports weren't in the diff. The missing test coverage wasn't in the diff. Nobody looked.
We built Deep Review to close that gap.
So what is Deep Review, actually?
It's an agent mode in Git AutoReview that uses Claude Code CLI to walk through your full codebase before reviewing a pull request.
Instead of feeding a diff to an LLM and hoping for useful comments, Deep Review spins up an agent that:
- Reads the PR diff to understand what changed
- Opens related files (imports, configs, tests, types, build scripts)
- Traces data flow across modules looking for broken connections
- Runs your linter on affected files
- Checks whether the changed code paths actually have tests
- Produces findings with severity ratings, file references, and fix suggestions
The agent doesn't guess from a context window. It opens your files, follows your imports, reads your tests. Think of it like the difference between looking at a floor plan and actually walking through the building.
Where diff-only review falls apart
I want to be concrete here. These failure modes affect every diff-based tool: CodeRabbit, GitHub Copilot, Qodo Merge, any custom GPT wrapper you've duct-taped together.
Cross-file dependency breaks
You rename utils/formatDate.ts to utils/formatDateTime.ts. The diff shows a clean rename. But formatDate is imported in OrderConfirmation.tsx, InvoiceGenerator.ts, and EmailTemplate.tsx. None of those files are in the diff. The diff-only tool sees a perfectly good rename and moves on.
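A toy sketch of what that check looks like, using the file names from the scenario above. The `findBrokenImports` helper and the data shapes are illustrative, not Git AutoReview's actual implementation:

```typescript
// Illustrative: flag imports that reference a symbol no module exports anymore.
type ImportRef = { file: string; imported: string };

const moduleExports: Record<string, string[]> = {
  // After the rename, utils exports formatDateTime, not formatDate.
  "utils/formatDateTime.ts": ["formatDateTime"],
};

const importsInRepo: ImportRef[] = [
  { file: "OrderConfirmation.tsx", imported: "formatDate" },
  { file: "InvoiceGenerator.ts", imported: "formatDate" },
  { file: "EmailTemplate.tsx", imported: "formatDate" },
];

// An import is broken when no module in the repo exports that symbol.
function findBrokenImports(
  imports: ImportRef[],
  exportsByModule: Record<string, string[]>
): ImportRef[] {
  const allExports = new Set<string>();
  for (const names of Object.values(exportsByModule)) {
    for (const name of names) allExports.add(name);
  }
  return imports.filter((ref) => !allExports.has(ref.imported));
}

const broken = findBrokenImports(importsInRepo, moduleExports);
console.log(broken.map((b) => b.file));
// All three consumers of the old name get flagged — none of them are in the diff.
```

The key point: the check only works if you can see the whole repo's imports and exports, which is exactly the information a diff doesn't carry.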
Hardcoded secrets in untouched files
Your PR adds a new API endpoint. The review focuses on the controller. Meanwhile, staging.env has an AWS key that was committed six months ago and nobody noticed. A diff-only tool never opens staging.env because it wasn't changed in this PR. Why would it?
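The check itself is simple once you actually read the files. A minimal sketch, assuming in-memory file contents — the config values are fake, and the pattern below covers only AWS access key IDs (which start with `AKIA` followed by 16 uppercase letters or digits):

```typescript
// Illustrative secrets scan over config files the diff never touched.
const configFiles: Record<string, string> = {
  "staging.env": "AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE", // AWS's documented example key
  ".env.example": "AWS_ACCESS_KEY_ID=your-key-here",       // placeholder, not a secret
};

// AWS access key IDs: "AKIA" + 16 uppercase alphanumerics.
const awsKeyPattern = /AKIA[0-9A-Z]{16}/;

function filesWithSecrets(files: Record<string, string>): string[] {
  return Object.entries(files)
    .filter(([, contents]) => awsKeyPattern.test(contents))
    .map(([name]) => name);
}

console.log(filesWithSecrets(configFiles));
// Only staging.env is flagged — the placeholder in .env.example passes.
```

A real scanner needs many more patterns and entropy checks, but even this toy version only finds the key because it opened a file that wasn't in the PR.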
Data flow vulnerabilities
A PR modifies input sanitization in the request handler. The diff looks secure — proper escaping, parameterized queries. But the sanitized value gets passed to a downstream function in another file that re-concatenates it into a raw SQL string. The vulnerability isn't in the diff. It's in the path the data takes afterward.
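Here's a hypothetical two-file sketch of that failure mode. The function names (`escapeInput`, `buildQuery`) are made up for illustration — the point is that each file looks fine in isolation:

```typescript
// "handler.ts" — the code in the diff: input looks sanitized here.
function escapeInput(raw: string): string {
  return raw.replace(/'/g, "''"); // naive single-quote escaping
}

// "reportQueries.ts" — a file NOT in the diff: re-concatenates into raw SQL.
function buildQuery(userId: string): string {
  // String interpolation reintroduces injection risk the handler's
  // quote-escaping doesn't cover (e.g. numeric contexts need no quotes).
  return `SELECT * FROM orders WHERE user_id = ${userId}`;
}

const fromRequest = escapeInput("1 OR 1=1"); // no quotes, so escaping changes nothing
const query = buildQuery(fromRequest);
console.log(query);
// The final query still contains the injected predicate. The flaw lives in
// the path the data takes across files, not in either file viewed alone.
```

A reviewer (human or agent) only catches this by tracing the value from the handler into `reportQueries.ts` — which means opening a file the diff never mentions.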
Architecture drift
A developer adds a caching layer to a service. Looks reasonable in isolation. But the system uses eventual consistency, and the caching introduces a race condition that only shows up if you read the architecture docs and the event handlers in a different module.
Missing test coverage
The PR adds a new feature. 200 lines of code. The tests pass. But there are zero tests for the new code — the existing tests cover old paths, nobody wrote new ones. A diff-only tool sees "tests pass" and gives a thumbs up.
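One way to surface this is to compare the PR's new exports against what the test files actually reference. A crude sketch with hypothetical names — real coverage tooling is far more precise, but even a string check beats "tests pass":

```typescript
// Illustrative coverage-gap check: which new exports do the tests never mention?
const newExports = ["handleRefresh", "revokeToken"];

const testFileSource = `
  import { revokeToken } from "../src/auth";
  test("revokeToken clears the session", () => { /* ... */ });
`;

function untestedExports(exported: string[], testSource: string): string[] {
  return exported.filter((name) => !testSource.includes(name));
}

console.log(untestedExports(newExports, testFileSource));
// handleRefresh is exported by the PR but never referenced by any test.
```

"Tests pass" and "the new code is tested" are different claims; you can only tell them apart by reading the test files.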
Deep Review would have caught every one of these. The only difference is that it actually opens the files.
How it works under the hood
When you trigger a Deep Review, here's the sequence:
1. Diff analysis. The agent reads the full PR diff to build a map of what changed — which files, which functions, which lines. Starting point, not the whole picture.
2. Dependency mapping. It follows imports, requires, and type references from changed files outward. UserController.ts imports from AuthService.ts? The agent opens AuthService.ts. That imports from TokenStore.ts? Opens that too. It builds a dependency graph around the PR.
3. Codebase exploration. Based on that graph, the agent opens relevant files — environment configs, build scripts, test files, type definitions. It reads your actual project, not a truncated context window.
4. Linter execution. It runs your project's linter (ESLint, Pylint, whatever you've got) against changed files and their immediate dependencies. Lint results inform the review but don't dominate it.
5. Test analysis. It checks which test files cover the changed code and reads them to understand what's actually tested. If a new function has zero coverage, it flags that.
6. Finding generation. It produces findings, each with a severity level (Critical through Info), file references, an explanation, and a fix suggestion.
You see all of this in a real-time activity log inside VS Code. When the agent finishes, you review each finding and decide what to publish.
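The dependency-mapping step above amounts to a breadth-first walk over an import graph. A minimal sketch, assuming a pre-extracted graph (real tooling parses source files to build it); the file names echo the earlier example, and `mapDependencies` is a hypothetical helper, not Git AutoReview's actual code:

```typescript
// Which modules can the PR's changes reach through imports?
const importGraph: Record<string, string[]> = {
  "UserController.ts": ["AuthService.ts"],
  "AuthService.ts": ["TokenStore.ts", "config/database.ts"],
  "TokenStore.ts": [],
  "config/database.ts": [],
};

// Breadth-first walk outward from the changed files.
function mapDependencies(
  changed: string[],
  graph: Record<string, string[]>
): Set<string> {
  const seen = new Set<string>(changed);
  const queue = [...changed];
  while (queue.length > 0) {
    const file = queue.shift()!;
    for (const dep of graph[file] ?? []) {
      if (!seen.has(dep)) {
        seen.add(dep);
        queue.push(dep);
      }
    }
  }
  return seen;
}

const scope = mapDependencies(["UserController.ts"], importGraph);
console.log([...scope]);
// A one-file diff expands into a four-file review scope.
```

That expansion is the whole trick: the review scope is the transitive closure of the change, not the change itself.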
Watching the agent think
This is the part I like most about Deep Review compared to cloud tools: you can watch it work.
The activity log shows every step:
```
[Agent] Reading PR diff... 12 files changed, 847 lines
[Agent] Opening src/services/AuthService.ts (imported by UserController)
[Agent] Opening src/config/database.ts (referenced in AuthService)
[Agent] Opening tests/auth.test.ts (test file for AuthService)
[Agent] Running ESLint on 4 changed files...
[Agent] Found: database.ts uses connection string from env without validation
[Agent] Checking test coverage for new handleRefresh() method...
[Agent] No tests found for handleRefresh — flagging as coverage gap
[Agent] Generating findings... 4 issues found (1 High, 2 Medium, 1 Info)
```
When a cloud tool tells you "this line might have an issue," you either trust it or you don't. With the activity log, you see what the agent read. If a finding seems off, you trace back through the log and figure out where it went wrong. Or where you went wrong.
Real findings from production codebases
Five things Deep Review has actually caught:
1. Hardcoded secrets across config files
Severity: CRITICAL
The agent opened 3 config files and found API keys in staging.env that weren't in .gitignore. The changed file in the PR was a controller. A diff-only review saw nothing.
2. Broken dependency path after refactor
Severity: HIGH
A config file referenced a build hook that no longer existed after a rename. The agent traced the path through package.json, tsconfig.json, and the build script to find the dead reference.
3. Error handler silently swallowing failures
Severity: HIGH
A try-catch in a service file caught all exceptions and logged them but never re-threw or returned an error. The caller assumed success. The agent found it by tracing the error path from controller through service to database layer.
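The pattern looks like this — a hedged sketch with invented names (`saveOrder`, `createOrderHandler`), compressed into one file for illustration:

```typescript
// The service catches everything, logs, and returns as if nothing failed.
async function saveOrder(order: { id: number }): Promise<void> {
  try {
    throw new Error("db connection refused"); // stand-in for a real DB call
  } catch (err) {
    console.error("saveOrder failed:", (err as Error).message);
    // Bug: no re-throw and no error return — the failure vanishes here.
  }
}

// The controller assumes success because saveOrder resolved normally.
async function createOrderHandler(): Promise<string> {
  await saveOrder({ id: 42 });
  return "201 Created"; // sent even though nothing was saved
}

createOrderHandler().then((status) => console.log(status));
// The handler reports success; only the log line hints that the write failed.
```

Neither function is wrong on its own — the catch block is plausible service-layer code, the handler is a plausible controller. The bug only exists in the contract between them, which is why tracing the error path across files is what finds it.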
4. Missing type validation on API boundary
Severity: MEDIUM
The API accepted userId as a string but passed it to a database query expecting a number. TypeScript types were correct at each layer individually, but the runtime conversion wasn't handled. The agent found the mismatch by reading the route handler, service layer, and database query together.
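A compressed sketch of that mismatch. Function names and the in-memory "database" are hypothetical; the unchecked cast stands in for whatever glue code passed the value through:

```typescript
// Route handler layer: query params always arrive as strings.
function getUserIdFromRequest(query: Record<string, string>): string {
  return query["userId"];
}

// Database layer: keys users by numeric id.
const usersById = new Map<number, string>([[7, "Ada"]]);

function findUser(userId: number): string | undefined {
  return usersById.get(userId);
}

// Each layer type-checks on its own, but threading the raw string through
// (modeled here as an unchecked cast) fails at runtime: Map key lookup
// uses strict equality, so get("7") is not get(7).
const rawId = getUserIdFromRequest({ userId: "7" });
const wrong = findUser(rawId as unknown as number); // undefined at runtime
const right = findUser(Number(rawId));              // the explicit conversion works
console.log(wrong, right);
```

TypeScript can't save you here because the lie happens at a boundary the compiler never sees as one — which is also why a reviewer has to read both layers together.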
5. Test file testing the wrong function
Severity: MEDIUM
After a refactor, a test file still imported and tested an old version of a function. Tests passed because the old function still existed (marked deprecated). The agent compared test imports against current module exports and flagged the staleness.
Quality assessment across six areas
Beyond individual findings, Deep Review scores your PR across categories:
| Category | What it checks |
|---|---|
| Security | Secrets, injection points, auth gaps, input validation |
| Architecture | Separation of concerns, dependency direction, coupling |
| Error handling | Try-catch coverage, error propagation, fallback logic |
| Type safety | Runtime type mismatches, any-casting, boundary validation |
| Test coverage | Coverage gaps, dead tests, assertion quality |
| Code standards | Linter compliance, naming conventions, dead code |
Each category gets a score and a summary of what needs attention. Your senior devs still do the real review — they just don't have to waste time catching mechanical stuff anymore.
How it compares
| Feature | Git AutoReview Deep Review | CodeRabbit | Greptile | GitHub Copilot |
|---|---|---|---|---|
| Analysis scope | Full codebase | Full codebase (cloud sandbox) | Full codebase (indexed) | Diff only |
| Runs where | Locally in VS Code | Cloud | Cloud | Cloud |
| Activity log | Real-time in VS Code | No | No | No |
| Human approval | Required before publishing | Optional | Optional | No |
| Multi-model | Claude, Gemini, GPT (BYOK) | Fixed | Fixed | Fixed (Copilot) |
| Platforms | GitHub, GitLab, Bitbucket | GitHub, GitLab, Bitbucket, Azure | GitHub, GitLab | GitHub only |
| Review time | 5-25 min | 2-10 min | 2-5 min | 30 sec |
| Pricing | From $9.99/mo + Claude sub | $24-30/user/mo | $30/user/mo | $10-39/user/mo |
| BYOK | Yes | No | No | No |
The trade-off is obvious: Deep Review is slower because it does more. Need a fast linter-level check? Use Quick Review mode (API-based, 15-30 seconds). Need the kind of review a senior engineer would do with coffee on a quiet Sunday? Deep Review.
When to use which mode
Git AutoReview gives you both, and they're built for different situations.
Quick Review makes sense for small PRs under 100 lines, routine changes like dependency bumps or formatting fixes, and when you're batch-reviewing a pile of PRs and need quick feedback.
Deep Review is for the PRs that keep you up at night. Large refactors touching business logic. New features with cross-cutting concerns. Security-sensitive changes. Anything going to main or production that you really don't want to break.
In practice, most teams run Quick Review on about 80% of their PRs and Deep Review on the 20% that actually matter. Review Profiles let you set this up once and switch with one click.
Setup takes about 5 minutes
1. Install Git AutoReview from the VS Code marketplace.
2. Install Claude Code CLI and sign in with your Anthropic account (Pro or Max subscription required).
3. Switch to "Deep Review" mode in the Git AutoReview sidebar. The extension detects Claude Code automatically.
4. Open a PR, click Review, watch the activity log. When it finishes, review findings and publish what you agree with.
No CI/CD changes. No GitHub App. No cloud config. Everything runs on your machine.
The honest trade-offs
We'd rather tell you the downsides upfront than have you find out and feel misled.
It's slow. 5 to 25 minutes per review, depending on project size. For quick feedback, use Quick Review mode instead.
The cost is higher too: Deep Review needs a paid Claude subscription (Pro or Max) on top of your Git AutoReview plan. For teams reviewing critical code daily, that makes sense. Solo devs reviewing small PRs? Quick Review with Gemini or Haiku is a better deal.
The agent is thorough but not omniscient. It catches real issues, but it still misses things a human with deep domain knowledge would spot. A strong first pass, not a replacement for your team.
And it only works with Claude Code CLI. Quick Review works with any model (Claude, Gemini, GPT) via standard API keys, so you're not locked in.
We think being straight about this builds more trust than pretending it's magic. Deep Review is the best option available today for catching cross-file issues in code review. It's not the fastest or cheapest. But when a PR touches critical infrastructure, 15 minutes of thorough analysis beats 15 seconds of diff scanning.
What developers are saying
"Claude Opus gives the best review comments, but for day-to-day reviews I use Gemini and Haiku — great price/performance balance." — Camilo H., Software Developer
"The AI catches things I would have missed, and I love that I can review everything before it gets published." — Viktor B., Sr. Software Architect
"It catches the things you'd normally miss yourself." — Jason O., Sr. Software Engineer
Try it
Deep Review is available now in Git AutoReview for VS Code.
- Free plan: 10 reviews/day, includes Deep Review
- Developer ($9.99/mo): 100 reviews/day, 10 repos
- Team ($14.99/mo): Unlimited reviews, team features
All plans support Deep Review. You just need Claude Code CLI installed separately.
Every finding requires your approval before it reaches your PR. AI suggests. You decide.