
Dark Matter Productivity: What the 4x AI Coding Gap Actually Measures
GitClear's January 2026 research dropped a number that made the rounds fast: developers who use AI tools throughout the day produce 4x to 10x more durable code than non-users. At face value, that looks like the strongest case yet for AI coding tools as a force multiplier. But the researchers themselves flagged a problem buried in their own data: the gap is too large to be explained by the tools alone.
They called it "dark matter productivity." Something else explains the chasm between AI power users and everyone else. And it matters because the answer changes how engineering leaders should invest in developer growth.
The study analyzed 2,172 developer-weeks of data pulled directly from Cursor, GitHub Copilot, and Claude Code APIs. This wasn't survey data or self-reported usage. GitClear measured actual AI tool engagement and correlated it with code output metrics across seven dimensions.
The headline finding: heavy AI users generate dramatically more durable code. But the same cohort also produced 9x more code churn than non-users. That's code written and then rewritten or deleted within weeks. More volume, yes. But also more waste.
GitClear's broader dataset tells a more measured story. Across 70,000 developer-years of data, median developer productivity increased just 9% from 2022 to 2025. Among developers averaging 500+ annual commits, the gain was 14.1%. Those aren't small numbers, but they're a long way from 4x.
The conclusion GitClear draws: the 4x productivity gap reflects who is using AI most, not what the tools themselves create. Already-strong developers gravitate toward these tools, adopt them faster, and leverage them more effectively. The tools accelerate existing capability. They don't manufacture it.
If AI tools primarily attract and accelerate senior developers, that raises an uncomfortable question: what happens to the pipeline of developers who haven't built those skills yet?
Junior developers build expertise through repetition. Writing boilerplate teaches you what the boilerplate does. Debugging your own mistakes teaches you how systems fail. Refactoring teaches you how to recognize structural problems. These are boring, frustrating tasks. They're also how most engineers get good.
AI tools remove those repetitions. The JetBrains 2025 Developer Ecosystem survey found that the tasks developers most commonly delegate to AI are writing boilerplate, searching the internet, converting code between languages, writing documentation, and summarizing recent changes. Every one of those is a learning opportunity for a junior engineer that now gets skipped.
Stack Overflow's 2025 Developer Survey makes the concern concrete: 20% of developers say they've become less confident in their own problem-solving since adopting AI tools. Another 16.3% report that it's hard to understand how or why the AI-generated code works. These aren't abstract concerns. They're signals that developers are producing code they can't fully explain.
GitClear's earlier 2025 research adds structural evidence. Between 2021 and 2025, the percentage of code changes involving refactoring dropped from 25% to under 10%. Duplicated code rose from 8% to 18%. Developers aren't restructuring code anymore; they're copying it. That's a pattern consistent with accepting AI suggestions without deeply understanding the underlying architecture.
The pessimistic framing isn't the whole picture. There's a legitimate case that AI tools can accelerate learning when used intentionally.
Stack Overflow's own data supports this: 33.1% of developers report using AI mostly for learning new concepts or technologies, making it the third most common AI use case. And 63.2% of AI agent users agree that agents have accelerated their learning about new technologies or codebases. That's not nothing.
Pair programming with an LLM can work when the developer treats the AI as a sparring partner rather than a code generator. Asking "why did you structure it this way?" or "what would break if I changed this?" turns a code completion tool into a Socratic tutor. The problem is that this mode of interaction requires a learner's mindset and, critically, enough baseline knowledge to know which questions to ask.
That creates a paradox. AI works best as a learning tool for people who already know enough to learn from it. A mid-level developer exploring a new framework can ask targeted questions and evaluate the answers. A junior developer who doesn't yet understand dependency injection may accept whatever the model generates and move on, learning nothing about why it works.
The trust data backs this up. Only 3.1% of developers highly trust AI output. But experienced developers are the most skeptical: they distrust AI output at the highest rates because they have the context to spot when something's off. Junior developers often lack that calibration.
If you're an engineering leader evaluating AI tool ROI, lines of code and commit counts tell you almost nothing about developer growth. GitClear's own data demonstrates why: their power users generated massively more output and massively more churn. Volume without durability is just noise.
Here's a practical framework for measuring whether your team is actually growing, not just producing more.
Code comprehension: Can a developer explain, without aid, how their PR works and why they made specific design choices? If someone can only describe what the AI generated but not why, that's a red flag. Track this qualitatively in code reviews. Ask "walk me through this" regularly.
Debugging capability: How does a developer perform when the AI can't help? Production incidents, unfamiliar codebases, and time-sensitive bugs expose whether someone understands systems or just knows how to prompt. Watch how developers handle incidents. If they immediately reach for an AI tool rather than reading logs, stack traces, or source code, the skill gap is showing.
Architectural reasoning: Does the developer make sound structural decisions, or do they default to whatever pattern the model suggests? This shows up in design reviews, RFC discussions, and how someone responds when asked "what are the tradeoffs here?" AI tools are notoriously weak at cross-cutting architectural decisions, so this is where human judgment still matters most.
Code churn ratio: GitClear's research gives you a metric for this. Track the ratio of durable code to churn per developer. A developer whose AI-assisted output churns at 9x the normal rate isn't being productive; they're being busy. If your tooling supports it, compare churn rates pre- and post-AI adoption per individual; a rough sketch of that calculation follows this list.
Review quality given: This is an underrated signal. A developer who writes thoughtful review comments on others' PRs demonstrates comprehension that goes beyond their own output. If a junior engineer starts catching logic errors and suggesting structural improvements in reviews, they're growing. If their review comments are shallow or absent, the tool isn't building the skills you need.
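Here's a minimal sketch of what tracking that ratio could look like, assuming "churn" means lines rewritten or deleted within roughly three weeks of being written. GitClear's actual pipeline does blame-level line tracking; the plain `git log --numstat` parsing and the 21-day window below are illustrative assumptions, not their method.

```python
# Rough churn-ratio proxy: lines a developer deletes from a file soon
# after adding lines there count as "churn"; everything else counts as
# durable. Real blame-level tracking would attribute churn precisely.
import subprocess
from collections import defaultdict
from datetime import datetime, timedelta

CHURN_WINDOW = timedelta(days=21)  # assumed ~3-week window

def changes(repo_path="."):
    """Yield (author, timestamp, path, added, deleted) per changed file."""
    out = subprocess.run(
        ["git", "-C", repo_path, "log", "--numstat",
         "--pretty=format:@%ae|%aI"],
        capture_output=True, text=True, check=True,
    ).stdout
    author = ts = None
    for line in out.splitlines():
        if line.startswith("@"):                 # commit header line
            author, iso = line[1:].split("|", 1)
            ts = datetime.fromisoformat(iso)
        elif line.strip():                       # numstat line: added, deleted, path
            added, deleted, path = line.split("\t", 2)
            if added != "-":                     # "-" marks binary files; skip them
                yield author, ts, path, int(added), int(deleted)

def churn_ratio_by_author(repo_path="."):
    last_add = {}   # (author, path) -> last time that author added lines there
    added, churned = defaultdict(int), defaultdict(int)
    for author, ts, path, add, delete in sorted(changes(repo_path),
                                                key=lambda r: r[1]):
        prev = last_add.get((author, path))
        if prev is not None and ts - prev <= CHURN_WINDOW:
            churned[author] += delete            # deleting recent additions = churn
        if add:
            last_add[(author, path)] = ts
        added[author] += add
    return {a: churned[a] / added[a] for a in added if added[a]}

if __name__ == "__main__":
    for author, ratio in sorted(churn_ratio_by_author().items()):
        print(f"{author}: {ratio:.0%} of added lines churned within 3 weeks")
```

Treat the output as a conversation starter, not a performance score: numstat-level data can't distinguish healthy iteration from wasted work.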
If AI handles the first pass of writing code, then code review becomes the primary learning surface for developers. This requires a deliberate shift in how teams think about reviews.
Google's 2025 DORA research found that most lead time in software delivery is waiting, not building. Flow efficiency sits around 21%. A microservice fix might take five minutes of coding but five days to get through review. That waiting time is typically treated as waste, but if you reframe it, review is where the most valuable learning can happen.
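The arithmetic behind that reframing is simple. Flow efficiency is active work time divided by total lead time; a quick sketch using the numbers above:

```python
# Flow efficiency = active work time / total lead time (DORA's framing).
def flow_efficiency(active_hours: float, waiting_hours: float) -> float:
    return active_hours / (active_hours + waiting_hours)

# The microservice example: ~5 minutes of coding, ~5 days waiting in review.
print(f"{flow_efficiency(5 / 60, 5 * 24):.2%}")  # ~0.07%: almost all waiting

# DORA's ~21% median implies roughly four hours of waiting per hour of work.
print(f"{flow_efficiency(1, 3.8):.0%}")          # ~21%
```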
In practice, that means treating review as teaching time rather than just a gate: have authors walk through their AI-assisted code in their own words, and have reviewers probe design choices and tradeoffs instead of rubber-stamping the diff.
The Stack Overflow data reveals a pattern that should concern engineering leaders thinking about team composition. AI tools show clear gains at the individual level: 52% of developers say AI has positively affected their productivity. But only 17% of AI agent users agree that agents have improved team collaboration. The tools help individuals go faster. They don't help teams get better.
That creates a tempting but dangerous optimization. If your strongest developers become 14% more productive with AI, and your junior developers show smaller or ambiguous gains, the short-term ROI calculation says to invest in tools for seniors and hire fewer juniors. Several companies are already making this bet.
The problem is obvious on a three-to-five-year timeline. If you stop developing junior engineers into senior ones, you eventually run out of senior engineers. And the market price for experienced developers who are also effective AI users will only go up. JetBrains found that 68% of developers already expect AI proficiency to become a job requirement. The developers who can use AI well and understand the code beneath it will be the scarcest, most expensive talent.
The better bet: invest in both. Give everyone the tools. But deliberately structure your development process so that using the tools doesn't bypass the learning. That means accepting some short-term velocity costs in exchange for a team that actually knows what it's building.
Don't let the 4x headline distract from what the data actually says. AI coding tools are valuable. They're also selection-biased toward people who were already good. If you only track output volume, you'll congratulate yourself on a productivity win while your team's depth quietly erodes.
Audit your metrics. If you're measuring developer impact solely through commits, PRs merged, or lines changed, add comprehension and durability signals. Track code churn. Watch how people perform in incidents. Make code review a structured learning exercise, not just a gate.
And protect the reps. The mundane work that AI makes easy to skip is the same work that builds engineers. Find a way to keep that in the rotation, even when it's slower. Your team in 2029 will thank you.