I was recently observing a team doing their day-to-day work. Their C-suite had introduced an “AI-first” policy over the summer, mandating that development teams use “AI” as much as possible on their code.
Starting in November, this mandate turned into a KPI for individual developers, and for teams: % of AI-generated code in Pull Requests. (And, no, I have no idea how they measure that. But I understand that tool use is being tracked. More tokens, nurse!)
The underlying threat didn’t need to be said out loud. “Use this technology more, or start looking for a new job.”
Developers are now incentivised to find reasons to use “AI” coding assistants, and they’re doing it at any cost. All other priorities rescinded. Crew expendable.
By now, we probably all know Goodhart’s Law:
“When a measure becomes a target, it ceases to be a good measure.”
I have a shorter version: be careful what you wish for.
The history of software development is littered with the bones of teams who were given incentives to adopt dysfunctional behaviour.
The classic “Lines of Code”, “Function Points”, “Velocity” and other easily gameable measures of “productivity” have forced thousands upon thousands of teams to take their eyes off the prize – i.e. business outcomes – and focus their efforts on producing more stuff – output.
Introducing mandates about how that stuff must be produced is a step up the dysfunction ladder.
So I had the privilege of watching a Java developer write the following prompt, which I jotted down for posterity.
Please extract the selected block of code into a new method called 'averageDailySales'
Using their IDE, that would have been just Ctrl+Alt+M and a method name. And, importantly, it would have worked first time. They ended up taking a second pass to fix the missing parameter the new method needed.
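For anyone who hasn't done this refactoring in a while, here's a minimal sketch of what Extract Method involves. I didn't record the team's actual code, so the class, the method bodies and the `dailySales` name are all my own invention for illustration. The point it shows is that the extracted block reads a variable from the enclosing method, so the new method has to receive it as a parameter. An automated refactoring works that out from the code itself.

```java
import java.util.List;

// Hypothetical example: every name here except averageDailySales is assumed,
// and none of this is the team's real code.
class SalesReport {

    // Before: the averaging logic sits inline in a larger method.
    static String summaryBefore(List<Double> dailySales) {
        double total = 0.0;
        for (double sale : dailySales) {
            total += sale;
        }
        double average = dailySales.isEmpty() ? 0.0 : total / dailySales.size();
        return "Average daily sales: " + average;
    }

    // After: the block has been extracted and replaced with a call.
    static String summaryAfter(List<Double> dailySales) {
        return "Average daily sales: " + averageDailySales(dailySales);
    }

    // The extracted block reads dailySales, so it must be passed in.
    // Leave the parameter out and this method will not compile.
    static double averageDailySales(List<Double> dailySales) {
        double total = 0.0;
        for (double sale : dailySales) {
            total += sale;
        }
        return dailySales.isEmpty() ? 0.0 : total / dailySales.size();
    }
}
```

Both versions of the summary method should behave identically, which is the whole point of a refactoring.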
The whole 2-hour session was a masterclass in trying to cook a complete roast dinner in a sandwich toaster. The goal was very clearly not to solve the problem, but to use the tool.
I’m not saying that a tool like Claude Code or Cursor would add no value in the process. I’m saying that developers should be incentivised to use the right tool for the job.
But the “AI-first” mandate has encouraged some of the developers to drop all the other tools. They’ve gone 100% “AI”. No IDE in sight.
An Integrated Development Environment is a Swiss Army Knife of tools for viewing, navigating, manipulating (including refactoring), executing, debugging, profiling, inspecting, testing, version controlling and merging code. Well, the ones I use are, anyway.
Could IDEs be better? For sure. But when it comes to, for example, extracting a method, they are still my go-to. It’s usually much faster, and it’s much, much safer. I’ll take predictable over powerful any day.
Using refactoring as an example, if my IDE doesn’t have the automated refactoring I need – e.g., there’s no Move Instance Method in PyCharm – then I’ll let Claude have a crack at it, with my finger poised over the reset button.
Because my focus is on achieving better outcomes, I’ve necessarily landed on a hybrid approach that uses Claude when that makes sense – and, if you read my blog regularly, you’ll know I’m still exploring that – and uses my IDE or some boring old-fashioned deterministic command line tool when that makes sense. And, right now, that’s most of the time.
I feel no compulsion to drink exclusively from the firehose “just because”.
But then, I’m the only shareholder. And that’s probably what “AI-first” policies are really about: optics. There’s something about this that genuinely feels performative. It’s not about using “AI”, it’s about being seen to use “AI”. Look at us! We’re cutting edge!
There’s no credible evidence that “AI” 10x’s dev team productivity. But there’s plenty of evidence that it can 10x a valuation.
The fact that, according to the more credible data, the technology slows most teams down – less reliable software gets delivered later and costs more – doesn’t seem to matter.
It’s quite revealing, if you think about it. Perhaps it never mattered?
I contracted in a London firm that would proudly announce in each year’s annual report how much they’d invested in technology. It didn’t seem to matter what return they got on that investment, just as long as they spent that £30 million on the latest “cool thing”.
When my team tried to engage with the business on real problems, the push-back came from the IT Director himself. That, apparently, was “not what we do here”. We’re here to chew bubblegum and spend money. And we’re all out of bubblegum.
So, in that sense, ’twas ever thus. But, as with all things “AI” these days, it’s a question of scale. Watching team after team after team drop everything to try and tame the code-generating firehose, while real business and real user needs go unaddressed, is quite the spectacle. It’s a hyper-scaled dysfunction.
Of course, eventually, reality’s going to catch up with us. I was interviewed for a Financial Times newsletter, The AI Shift, a few weeks ago, and it was clear that the resetting of expectations has spread far beyond the dev floor. People who aren’t software developers are starting to notice.
If, like me, you’re interested in what’s real and what works in developing software – with or without “AI” – you might want to visit my training and coaching site for details of courses and consulting in principles and practices that are proven to shorten lead times, improve reliability of releases and lower the cost of change.
I mean, if that’s your sort of thing.
And if you’re curious about what really seems to work when we’re using “AI” coding assistants, I’ve brain-dumped my learnings from nearly 3 years of experimenting with and exploring the code-generating firehose. You might be surprised to hear that it has very little to do with code generation, and almost everything to do with the real bottlenecks in development.
Then again, you might not.