Code Review
Jun 2026

GitHub PR Coverage Is Useful. Here's What It Can't Tell You.

Eddie Wang, Engineering


GitHub shipped native code coverage on pull requests yesterday. Public preview, available on Enterprise Cloud and Team plans, free during the preview period. You add the upload-code-coverage action to your CI workflow, grant the code-quality:write permission, and GitHub posts an aggregate coverage percentage as a comment on every PR.

It's a genuinely useful addition. Coverage as a first-class PR signal means reviewers don't have to context-switch to Codecov or a third-party dashboard just to see whether the new code has tests. But there's a gap between "this code was executed by a test" and "this code is correct," and that gap is where bugs live.

What GitHub's coverage feature actually does

The setup is straightforward. Your CI generates a Cobertura XML report (most test frameworks already support this), and the upload-code-coverage action sends it to GitHub. GitHub then posts a comment on the PR with the aggregate line coverage percentage.

That number tells you one thing: what percentage of lines in the changed files were executed during the test run. If a PR touches 200 lines and 160 of them are hit by some test, coverage is 80%. The reviewer sees this at a glance without leaving the PR.
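The arithmetic behind that figure can be sketched in a few lines. This is only an illustration: the `LineHit` shape and the `lineCoveragePercent` helper are hypothetical names, not part of GitHub's action or the Cobertura format, and returning 100 for an empty report is an assumed convention.

```typescript
// Hypothetical record: one entry per measured line, with its hit count from the test run.
interface LineHit {
  line: number;
  hits: number; // how many times any test executed this line
}

// Line coverage = lines executed at least once / lines measured, as a percentage.
function lineCoveragePercent(lines: LineHit[]): number {
  if (lines.length === 0) {
    return 100; // assumed convention for "nothing to measure"
  }
  const covered = lines.filter((l) => l.hits > 0).length;
  return (covered * 100) / lines.length;
}

// The example from the text: 200 lines, 160 of them hit by some test.
const measuredLines: LineHit[] = [];
for (let i = 1; i <= 200; i++) {
  measuredLines.push({ line: i, hits: i <= 160 ? 1 : 0 });
}

const coveragePercent = lineCoveragePercent(measuredLines);
console.log(`${coveragePercent}% line coverage`); // 80% line coverage
```

Note what the calculation ignores: a line executed once counts the same as a line executed a thousand times, and nothing in it records whether any assertion looked at the result.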

That's valuable. A PR with 12% coverage probably deserves a harder look than one with 90%. As a triage signal, coverage works.

Three things a coverage percentage can't tell you

Coverage measures execution, not correctness. Here's where the distinction matters in practice.

1. Whether the tests assert anything meaningful

A test that calls a function and never checks the return value still counts as coverage. So does a test that asserts expect(result).toBeDefined() on an object that should have been null. The test ran the code. The coverage report says 100%. The test proved nothing.

This isn't a theoretical problem. Mutation testing research consistently shows that test suites with high line coverage often fail to detect injected faults. Coverage tells you the test suite visited the code. It doesn't tell you the test suite would catch a regression.
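A minimal sketch of the idea, with no test framework involved: `applyTax` and its hand-written mutant are hypothetical, and real mutation testing tools generate mutants automatically rather than by hand. Both tests below execute every line of the function, so both would report full line coverage.

```typescript
// Hypothetical function under test.
function applyTax(amount: number, rate: number): number {
  return amount + amount * rate;
}

// A hand-written "mutant": the same function with one injected fault (+ became -).
function applyTaxMutant(amount: number, rate: number): number {
  return amount - amount * rate;
}

type TaxFn = (amount: number, rate: number) => number;

// Weak test: runs the code but only checks that something came back.
function weakTest(fn: TaxFn): boolean {
  const result = fn(100, 0.25);
  return result !== undefined;
}

// Strong test: checks the actual value.
function strongTest(fn: TaxFn): boolean {
  return fn(100, 0.25) === 125;
}

// A mutant "survives" a test when the test still passes on the faulty code.
const mutantSurvivesWeakTest = weakTest(applyTax) && weakTest(applyTaxMutant);
const mutantSurvivesStrongTest = strongTest(applyTax) && strongTest(applyTaxMutant);

console.log(mutantSurvivesWeakTest); // true: the fault goes unnoticed
console.log(mutantSurvivesStrongTest); // false: the fault is caught
```

Coverage is identical in both cases. Only the assertion decides whether the injected fault is detected.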

2. Whether the covered code has logic errors

Consider a function that calculates a discount. The test passes in a standard order and checks the result. Coverage: 100%. But the function uses > instead of >= in a boundary check, so orders exactly at the threshold get the wrong price. The test doesn't exercise that boundary. Coverage can't see the off-by-one.
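Here is one way that scenario could look. The threshold, the 10% rate, and the rule that an order exactly at the threshold qualifies are all assumptions made up for this sketch. Amounts are in integer cents to keep the arithmetic exact.

```typescript
// Assumed rule for this sketch: orders of 100.00 or more get 10% off.
const THRESHOLD_CENTS = 10000;
const DISCOUNT_PERCENT = 10;

// Buggy version: `>` excludes an order of exactly 100.00.
function discountedTotalBuggy(totalCents: number): number {
  const percentOff = totalCents > THRESHOLD_CENTS ? DISCOUNT_PERCENT : 0;
  return totalCents - (totalCents * percentOff) / 100;
}

// Intended version under the assumed rule: `>=` includes the boundary.
function discountedTotalFixed(totalCents: number): number {
  const percentOff = totalCents >= THRESHOLD_CENTS ? DISCOUNT_PERCENT : 0;
  return totalCents - (totalCents * percentOff) / 100;
}

// The "standard order" test: one call executes both lines of the buggy function,
// so line coverage is 100%, and the assertion passes.
const standardOrderPasses = discountedTotalBuggy(20000) === 18000;

// The boundary case the test never exercised.
const buggyAtThreshold = discountedTotalBuggy(10000); // 10000: no discount applied
const fixedAtThreshold = discountedTotalFixed(10000); // 9000: discount applied
```

A test at exactly `THRESHOLD_CENTS` would separate the two versions. The coverage number is the same with or without it.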

More broadly, coverage can't evaluate intent. It doesn't know what the code was supposed to do, only that something executed it.

3. Whether error paths and security-sensitive branches are safe

An authentication handler might have 85% coverage because the happy path is well-tested. But the catch block that handles a malformed JWT? It logs the error and returns a 200 instead of a 401. Coverage may report that catch block as uncovered, or worse, as covered because some other test triggered it incidentally. Either way, it can't flag that the error handling itself is wrong.
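A stripped-down sketch of that failure mode follows. Everything here is hypothetical: `verifyToken` is a stand-in that only checks the three-part shape of a token and does not verify a signature, and `HttpResponse` is a toy type rather than any framework's API.

```typescript
interface HttpResponse {
  status: number;
  body: string;
}

const errorLog: string[] = [];

// Stand-in for a real JWT library (assumption): throws unless the token
// has three non-empty dot-separated parts. It does NOT check signatures.
function verifyToken(token: string): { sub: string } {
  const parts = token.split(".");
  if (parts.length !== 3 || parts.some((p) => p.length === 0)) {
    throw new Error("malformed token");
  }
  return { sub: "user-123" };
}

// Buggy handler: the catch block logs the error, then returns 200.
function handleRequestBuggy(token: string): HttpResponse {
  try {
    const claims = verifyToken(token);
    return { status: 200, body: "hello " + claims.sub };
  } catch (err) {
    errorLog.push(err instanceof Error ? err.message : String(err));
    return { status: 200, body: "ok" }; // bug: the request is not rejected
  }
}

// Corrected handler: same logging, but the request is rejected with 401.
function handleRequestFixed(token: string): HttpResponse {
  try {
    const claims = verifyToken(token);
    return { status: 200, body: "hello " + claims.sub };
  } catch (err) {
    errorLog.push(err instanceof Error ? err.message : String(err));
    return { status: 401, body: "unauthorized" };
  }
}

const buggyStatus = handleRequestBuggy("not-a-jwt").status; // 200
const fixedStatus = handleRequestFixed("not-a-jwt").status; // 401
```

A test that sends a malformed token to the buggy handler and never checks the status code would mark the catch block as covered. The two handlers are indistinguishable by coverage alone.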

Security-sensitive code needs more than execution confirmation. It needs someone (or something) reading the logic and asking: "Does this actually reject the request it should reject?"

What Tenki's review layer reads instead

Tenki's code reviewer doesn't look at coverage metrics. It reads the actual implementation in every PR: the diff, the surrounding context, and the codebase it sits in. Then it flags problems that coverage scores mask.

Where coverage answers "did a test run this line?", Tenki answers a different set of questions:

  • Is this logic correct? Off-by-one errors, incorrect operator usage, wrong variable references, inverted conditionals.
  • Are edge cases handled? Null inputs, empty arrays, concurrent access, boundary values.
  • Are error paths safe? Catch blocks that swallow errors, missing status codes, leaked resources on failure.
  • Does this fit the existing codebase? Inconsistent patterns, duplicated logic, naming that breaks conventions.

The setup takes about two minutes: install the GitHub App, connect your repositories, and reviews start on your next PR. You can configure severity thresholds, set custom rules that match your team's conventions, and adjust verbosity so the reviewer focuses on what matters to you.

In benchmarks against six other AI reviewers on 122 real production bugs, Tenki detected 69% of issues. The next closest was 36%. That difference comes from reading the implementation itself rather than relying on surface-level metrics.

Coverage + review: two signals, one gate

The best use of GitHub's coverage feature isn't as a standalone merge gate. It's as one signal alongside an actual code review. Here's a practical workflow for teams that want both:

  1. CI runs tests and generates a Cobertura report. The upload-code-coverage action posts the aggregate percentage on the PR.
  2. Tenki reviews the PR automatically. It reads the diff, checks for logic errors, edge case gaps, and security concerns, and posts inline comments.
  3. The human reviewer gets both signals. Coverage tells them whether the new code has tests at all. Tenki tells them whether the code (tested or not) has problems.
  4. Merge with confidence. The coverage number handled the "is this tested?" question. The AI review handled the "is this correct?" question. The human reviewer can focus on architecture, design, and product intent.
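The combined gate can be sketched as a single decision function. This is a thought experiment, not an integration: the `PullRequestSignals` shape, the severity names, and the 70% default are all assumptions, and nothing in the source describes GitHub or Tenki exposing data in this form.

```typescript
type Severity = "low" | "medium" | "high" | "critical";

// Hypothetical shape of one review comment.
interface ReviewFinding {
  severity: Severity;
  message: string;
}

// Hypothetical bundle of the two signals for one PR.
interface PullRequestSignals {
  coveragePercent: number;
  findings: ReviewFinding[];
}

interface GateDecision {
  allow: boolean;
  reasons: string[];
}

// Blocks on either signal independently: low coverage OR a serious finding.
function evaluateGate(signals: PullRequestSignals, minCoverage: number = 70): GateDecision {
  const reasons: string[] = [];
  if (signals.coveragePercent < minCoverage) {
    reasons.push("coverage " + signals.coveragePercent + "% is below " + minCoverage + "%");
  }
  const blocking = signals.findings.filter(
    (f) => f.severity === "high" || f.severity === "critical"
  );
  if (blocking.length > 0) {
    reasons.push(blocking.length + " blocking review finding(s)");
  }
  return { allow: reasons.length === 0, reasons };
}

// High coverage does not rescue a PR with a critical finding in covered code.
const wellCoveredButWrong = evaluateGate({
  coveragePercent: 95,
  findings: [{ severity: "critical", message: "boundary check uses > instead of >=" }],
});
// wellCoveredButWrong.allow === false
```

The design point is that neither check can compensate for the other: each one fails the gate on its own.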

This isn't about replacing one tool with the other. Coverage and code review measure fundamentally different things. A PR with 95% coverage and a critical logic error in the covered path is a realistic scenario, not a contrived one. Without something reading the actual code, that error sails through.

The right way to think about it

GitHub adding coverage to PRs is a good move. It closes a real gap: too many teams had to leave the PR page to check test coverage, and a lot of teams just didn't bother. Having that number right on the PR is strictly better than not having it.

But coverage answers "what ran?" and review answers "what's wrong?" They're orthogonal. If your merge gate only checks one, you're flying half-blind.

If you're already using GitHub Actions for CI, adding both takes about five minutes total. Enable coverage in your workflow, install Tenki's code reviewer, and your next PR gets both signals before a human ever opens it.

Tags

#code-coverage #copilot-code-review
