
On May 19, 2026, GitHub shipped a quiet update to Copilot code review. The old Implement suggestion button became Fix with Copilot, with a new dialog for controlling how fixes get applied. More importantly, GitHub added Fix batch with Copilot: select multiple review comments, hand them to the Copilot cloud agent, and let it apply them all at once.
The feature is genuinely useful. It's also the final piece in a loop that should make engineering leads pause: Copilot writes the PR, Copilot reviews the PR, Copilot batch-applies its own review fixes, and a new commit lands. No human wrote a line of code at any step. The question isn't whether that loop works. It does. The question is what verifies that the batch-fix commit is actually safe to merge.
The update makes two changes, both related to how developers act on Copilot's code review feedback.
The single-comment flow now opens a dialog when you click Fix with Copilot. You choose whether to apply the change directly to your PR branch or open a new PR. You pick which model Copilot uses. You can add extra instructions. Previously, the button just tagged @Copilot in a comment and hoped for the best.
The batch flow is the bigger deal. Copilot's Pull Request Overview comment now has a Fix batch with Copilot button. You select which review comments to include, and the cloud agent addresses them all in a single pass. Instead of handling eight suggestions one at a time, you check the boxes, hit the button, and walk away.
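Step back and look at what's now possible end to end with GitHub Copilot: Copilot writes the PR, Copilot reviews the PR, Copilot batch-applies its own review fixes, and a new commit lands on the branch.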
At no point did a human read the diff, run the tests locally, or verify that the batch of fixes didn't interact badly with each other. Each individual fix might be correct. The model applied exactly what the reviewer (also the model) asked for. But "correctly applied" and "safe to ship" aren't the same thing.
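This distinction matters more than it looks. When Copilot's review leaves a suggestion, it evaluates intent and code patterns by inference over the diff. It does not execute the code, so it cannot observe how the change behaves once it is integrated with everything around it.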
Batch application makes this worse, not better. When you apply eight fixes in a single commit, the blast radius of a bad interaction grows. Any two suggestions might individually pass review and collectively break the build. The model has no mechanism to detect this because detection requires execution, not inference.
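To make the interaction risk concrete, here is a minimal Python sketch. The module, the two fixes, and the helper names (apply_fix, passes_tests) are all hypothetical and invented for illustration; they are not how Copilot applies changes. Each fix passes the check on its own, and the combination fails only when the code actually runs.

```python
# Hypothetical module under review. All names are illustrative.
BASE = '''
def get_price(item):
    return item["price"]

def total(items):
    return sum(get_price(i) for i in items)
'''

# Fix A: rename the helper, updating the one call site that existed at review time.
FIX_A = [
    ("def get_price(item)", "def price_of(item)"),
    ("sum(get_price(i)", "sum(price_of(i)"),
]

# Fix B: skip free items, written against the original helper name.
FIX_B = [
    ("for i in items)", "for i in items if get_price(i) > 0)"),
]


def apply_fix(source, fix):
    """Apply each (old, new) replacement once, like a small patch."""
    for old, new in fix:
        if old not in source:
            raise ValueError(f"patch context not found: {old!r}")
        source = source.replace(old, new, 1)
    return source


def passes_tests(source):
    """Execute the module and run one behavioural check against it."""
    namespace = {}
    try:
        exec(source, namespace)
        cart = [{"price": 2}, {"price": 0}, {"price": 3}]
        return namespace["total"](cart) == 5
    except Exception:
        return False


only_a = apply_fix(BASE, FIX_A)
only_b = apply_fix(BASE, FIX_B)
both = apply_fix(apply_fix(BASE, FIX_A), FIX_B)

print(passes_tests(only_a))  # True
print(passes_tests(only_b))  # True
print(passes_tests(both))    # False: total() still calls the renamed get_price
```

Both patches apply cleanly to the text, so nothing looks wrong in the diff. The failure is a NameError that appears only when total() is called, which is the kind of defect that execution finds and inference can miss.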
The batch-fix commit is just another push to the PR branch. Your CI pipeline should trigger on it like any other commit. But "should" and "does" aren't always the same, especially when teams start treating Copilot's review approval as a green light.
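One way to turn "should" into "does" is to check CI results against the exact commit the batch fix produced, not against an earlier commit on the same PR. The sketch below is a minimal illustration, assuming the check runs for the PR have already been fetched and that each one carries name, status, conclusion, and head_sha fields in the shape GitHub's check-runs API returns. The function name ci_verified, the SHAs, and the required check names are hypothetical.

```python
def ci_verified(check_runs, head_sha, required_checks):
    """Return True only if every required check succeeded on head_sha.

    check_runs: list of dicts shaped like GitHub check-run objects.
    required_checks: names of the checks your team treats as mandatory
    (hypothetical values in the usage below).
    """
    succeeded = {
        run["name"]
        for run in check_runs
        if run["head_sha"] == head_sha
        and run["status"] == "completed"
        and run["conclusion"] == "success"
    }
    return set(required_checks) <= succeeded


# Example: tests passed on the pre-fix commit, but only lint has finished
# on the batch-fix commit.
runs = [
    {"name": "tests", "head_sha": "aaa111", "status": "completed", "conclusion": "success"},
    {"name": "lint", "head_sha": "bbb222", "status": "completed", "conclusion": "success"},
    {"name": "tests", "head_sha": "bbb222", "status": "in_progress", "conclusion": None},
]

print(ci_verified(runs, "aaa111", ["tests"]))          # True
print(ci_verified(runs, "bbb222", ["tests", "lint"]))  # False
```

Keying the check on head_sha is the point of the sketch: a green result on the pre-fix commit says nothing about the commit the batch fix just pushed.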
A proper gate for batch-fix commits needs to answer three questions:
"The model reviewed it" is not the same statement as "CI verified it ran clean." They test different things. The model checks intent and code patterns. CI checks execution and integration. You need both.
The fix-review-fix loop works best when there's an independent checkpoint between "fixes applied" and "PR merged." That checkpoint needs to be outside the Copilot ecosystem so it isn't grading its own homework.
This is where a CI-integrated review tool earns its keep. When the batch-fix commit lands, the PR re-enters the CI pipeline. A tool like Tenki's code reviewer runs on the updated diff as part of that pipeline. It's reviewing the post-fix state, not the pre-fix state. If the batch of fixes introduced a null pointer path, a type mismatch, or a logic error that Copilot's review didn't catch, the independent reviewer flags it before the PR reaches a human approver.
The key property here is independence. Copilot reviewing its own fixes is like an author proofreading their own manuscript. They'll catch typos, but they'll miss structural problems because they already know what they meant to write. An independent reviewer doesn't share that blind spot.
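If your team plans to use Fix batch with Copilot (and you should, because it saves real time), keep the loop safe by putting every batch-fix commit through the same gates as any other push: CI, an independent review, and a human sign-off.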
Fix batch with Copilot solves a real pain point. Handling review suggestions one at a time was slow, and the old workflow of tagging @Copilot in a comment was clunky. The new dialog gives you model selection and extra instructions. That's a meaningful improvement in developer control.
But the loop it closes has a gap. The AI wrote it, the AI reviewed it, the AI fixed it. At every step, the same system (or systems that share the same blind spots) evaluated the work. That's not a review process. That's a feedback loop with no external input.
The fix is straightforward: treat the batch-fix commit like any other push. Run CI. Run an independent review. Require a human sign-off. The automation speeds up the fix cycle. The gate ensures the fix cycle doesn't ship regressions.
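As a rough sketch of that gate, the Python below combines the three signals for the current head commit. The Review type, the can_merge function, and the SHAs are hypothetical names for illustration. Requiring the human approval to be on the head commit itself is an assumption of this sketch, stricter than a plain sign-off, chosen so that an approval given before the batch fix doesn't carry over.

```python
from dataclasses import dataclass


@dataclass
class Review:
    reviewer: str    # login of the reviewer
    is_human: bool   # False for bots and AI reviewers
    approved: bool
    commit_sha: str  # commit the review was submitted against


def can_merge(head_sha, ci_green_shas, independent_review_shas, reviews):
    """Combine the three gates; each must hold for the current head commit."""
    ci_ok = head_sha in ci_green_shas
    independent_ok = head_sha in independent_review_shas
    human_ok = any(
        r.is_human and r.approved and r.commit_sha == head_sha for r in reviews
    )
    return ci_ok and independent_ok and human_ok


# A human approved the pre-fix commit (aaa111); the batch fix then pushed bbb222.
reviews = [Review("alice", True, True, "aaa111")]
stale = can_merge("bbb222", {"aaa111", "bbb222"}, {"bbb222"}, reviews)
print(stale)  # False: the only human approval predates the batch fix

reviews.append(Review("alice", True, True, "bbb222"))
fresh = can_merge("bbb222", {"aaa111", "bbb222"}, {"bbb222"}, reviews)
print(fresh)  # True
```

In practice you would more likely configure this than code it. GitHub's branch protection settings include required status checks and an option to dismiss stale approvals when new commits are pushed, which aim at a similar effect.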