AI Is Helping Developers Ship More Code Than Ever. The Quality Problem Is Getting Worse.

Software is being written faster than at any point in human history. And it’s starting to show — not in the way the industry hoped.

A Business Insider report lays out a troubling pattern emerging across the tech industry: AI coding assistants are dramatically accelerating the volume of code being produced, but the quality of that code isn’t keeping pace. More software shipped. More bugs shipped with it. The boom in AI-assisted development — driven by tools like GitHub Copilot, Cursor, and a growing roster of competitors — has created a productivity surge that looks incredible on dashboards and terrifying in production environments.

The numbers are hard to ignore. Developers using AI assistants report writing code two to three times faster than before. Google CEO Sundar Pichai said in late 2024 that more than a quarter of new code at Google is now generated by AI, with engineers reviewing and accepting it. That figure has almost certainly grown since. Microsoft, which owns GitHub, has reported similar acceleration across its engineering teams.

But speed and volume aren’t the same thing as quality. Not even close.

GitClear, a developer analytics firm, published research showing that code churn, the rate at which recently written code gets rewritten or deleted, has spiked significantly since AI coding tools became widespread. Their 2024 analysis found that the share of “added” and “copy/pasted” code increased while “moved” code, a signal of thoughtful refactoring, declined. The implication is stark: developers are accepting AI-generated suggestions without fully understanding them, then circling back later to fix what doesn’t work. That’s not productivity. That’s deferred debugging.
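To make the churn metric concrete, here is a minimal sketch of how a team might compute it from per-line history. It is not GitClear's methodology: the `LineRecord` type, the `churn_rate` function, the 14-day window, and the sample data are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class LineRecord:
    """Hypothetical record for one added line of code."""
    added_on: date
    # Date the line was next rewritten or deleted; None if it was never touched.
    changed_on: Optional[date]


def churn_rate(records, window_days=14):
    """Share of added lines rewritten or deleted within window_days of being added.

    The 14-day default is an assumed cutoff, not a standard.
    """
    if not records:
        return 0.0
    window = timedelta(days=window_days)
    churned = sum(
        1
        for r in records
        if r.changed_on is not None and r.changed_on - r.added_on <= window
    )
    return churned / len(records)


# Made-up sample history, for illustration only.
sample = [
    LineRecord(date(2024, 3, 1), date(2024, 3, 5)),   # rewritten after 4 days: churn
    LineRecord(date(2024, 3, 1), None),               # never touched
    LineRecord(date(2024, 3, 2), date(2024, 6, 1)),   # changed months later: not churn
    LineRecord(date(2024, 3, 3), date(2024, 3, 10)),  # deleted after 7 days: churn
]
print(churn_rate(sample))  # 0.5
```

A rising value of this ratio over time is the kind of signal the research describes: more of what gets written is being redone shortly afterward.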

So what’s actually happening on the ground? Engineers I’ve talked to describe a consistent pattern. AI tools are excellent at generating boilerplate, scaffolding basic functions, and autocompleting predictable patterns. They’re far less reliable when it comes to architecture decisions, edge cases, security considerations, and the kind of subtle logic that separates software that works from software that works well. The tools produce code that looks right. Often it compiles. Sometimes it even passes basic tests. But the failure modes are insidious — the kind of bugs that don’t surface until they’re in front of real users doing unpredictable things.

This creates a new kind of technical debt: one that accumulates fast and is hard to spot.

The Business Insider piece highlights growing concern among engineering leaders that their teams are shipping more features while simultaneously degrading their codebases. One source described it as “running faster toward a cliff.” The pressure to adopt AI tools is immense — partly from executives who see the productivity metrics, partly from developers themselves who don’t want to fall behind. But the organizational incentives are misaligned. Teams get rewarded for velocity. Nobody gets promoted for saying, “I slowed down and wrote fewer lines of better code.”

And the junior developer problem is real. Engineers early in their careers are learning to code alongside AI from day one, which means they’re sometimes accepting generated output they can’t fully evaluate. That’s a skills gap that compounds over time. Senior engineers have the context to reject bad suggestions. Juniors often don’t know what they don’t know. As Stack Overflow’s blog noted, AI didn’t produce 10x developers — in some cases, it created developers who ship ten times as much code they can’t maintain.

There’s a counterargument, of course. Proponents say the tooling is improving rapidly, that better models will generate better code, and that the current quality issues are transitional. That’s plausible. GPT-4 produces notably better code than GPT-3.5 did. Claude, Gemini, and newer models continue to improve on benchmarks. But benchmarks aren’t production systems, and the gap between “works on a coding challenge” and “works in a complex, interdependent codebase” remains wide.

Testing infrastructure hasn’t caught up either. If AI can generate code three times faster, you need testing that can validate three times faster — and most organizations don’t have that. Manual code review becomes a bottleneck. Automated test suites, many of which were already insufficient, now face an even larger surface area of generated code to cover.
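The arithmetic behind that bottleneck is simple enough to sketch. The toy model below assumes a fixed weekly review capacity and a steady weekly output; the function name `review_backlog` and every number in it are hypothetical, not figures from any of the reports cited here.

```python
def review_backlog(lines_per_week, review_capacity_per_week, weeks):
    """Return the unreviewed-line backlog at the end of each week.

    Toy model: output and review capacity are constant, and any
    unreviewed lines carry over to the following week.
    """
    backlog = 0
    history = []
    for _ in range(weeks):
        backlog = max(0, backlog + lines_per_week - review_capacity_per_week)
        history.append(backlog)
    return history


# Hypothetical team: reviewers can carefully read 1,000 lines a week.
before = review_backlog(1000, 1000, 4)  # output matches review capacity
after = review_backlog(3000, 1000, 4)   # output triples, capacity does not

print(before)  # [0, 0, 0, 0]
print(after)   # [2000, 4000, 6000, 8000]
```

Under these assumptions the backlog does not level off; it grows every week by the gap between what is written and what can be reviewed, which is why faster generation without faster validation tends to mean either slower merges or thinner review.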

The companies that will come out ahead are the ones treating AI coding tools as accelerators for experienced engineers rather than replacements for engineering judgment. That means investing in code review processes, maintaining high standards for test coverage, and resisting the temptation to conflate output volume with progress.

More code has never meant better software. That was true before AI, and it matters even more now. The industry is producing an unprecedented quantity of software. Whether any of it holds up is the question nobody’s answering fast enough.
