Engineering Leaders: Your AI Adoption Doesn’t Start With AI

In the past few months, I’ve been hearing from more and more teams that the use of AI coding tools is being strongly encouraged in their organisations.

I’ve also been hearing that the mandate often comes with high expectations from leaders about the productivity gains this technology will bring. But that narrative is rapidly giving way to frustration when those gains fail to materialise.

The best data we have shows that a minority of development teams are reporting modest gains – in the order of 5%-15% – in outcomes like delivery lead times and throughput. The rest appear to be experiencing negative impacts, with lead times growing and the stability of releases getting worse.

The 2025 DevOps Research & Assessment State of AI-assisted Software Development report makes it clear that the teams reporting gains were already high-performing or elite by DORA’s classification, releasing frequently, with short lead times and with far fewer fires in production to put out.

As the report puts it, this is not about tools or technology – and certainly not about AI. It’s about the engineering capability of the team and the surrounding organisation.

It’s about the system.

Teams who design, test, review, refactor, merge and release in bigger batches are overwhelmed by what DORA describes as “downstream chaos” when AI code generation makes those batches even bigger. Queues and delays get longer, and more problems leak into releases.

Teams who design, test, review, refactor, merge and release continuously in small batches tend to get a boost from AI.

In this respect, the team’s ranking within those DORA performance classifications is a reasonably good predictor of the impact on outcomes when AI coding assistants are introduced.

The DORA website helpfully has a “quick check” diagnostic questionnaire that can give you a sense of where your team sits in their performance bands.

(Answer as accurately as you can. Perception and aspiration aren’t capability.)

The overall result is usefully colour-coded. Red is bad, blue is good. Average is Meh. Yep, Meh is a colour.

If your team’s overall performance is in the purple or red, AI code generation’s likely to make things worse.

If your team’s performance is comfortably in the blue, they may well get a little boost. (You can abandon any hopes of 2x, 5x or 10x productivity gains. At the level of team outcomes, that’s pure fiction.)

The upshot of all this is that before you even think about attaching a code-generating firehose to your development process, you need to make sure the team’s already performing at a blue level.

If they’re not, then they’ll need to shrink their batch sizes – take smaller steps, basically – and accelerate their design, test, review, refactor and merge feedback loops.

Before you adopt AI, you need to be AI-ready.

Many teams go in the opposite direction, tackling whole features in a single step – specifying everything, letting the AI generate all the code, testing it after-the-fact, reviewing the code in larger change-sets (“LGTM”), doing large-scale refactorings using AI, and integrating the whole shebang in one big bucketful of changes.

Heavy AI users like Microsoft and Amazon Web Services have kindly been giving us a large-scale demonstration of where that leads – more bugs, more outages, and significant reputational damage.

A smaller percentage of teams are learning that what worked well before AI works even better with it. Micro-iterative practices like Test-Driven Development, Continuous Integration, Continuous Inspection, and real refactoring (one small change at a time) are not just compatible with AI-assisted development, they’re essential for avoiding the “downstream chaos” DORA finds in the purple-to-red teams.

And while many focus on the automation aspects of Continuous Delivery – and a lot of automation is required to accelerate the feedback loops – by far the biggest barrier to pushing teams into the blue is skills.

Yes. SKILLS.

Skills that most developers, regardless of their level of experience, don’t have. The vast majority of developers have never even seen practices like TDD, refactoring and CI being performed for real.

That’s partly because real practitioners are pretty rare, so they’re unlikely to bump into one. But much of it comes down to these practices’ famously steep learning curves. TDD, for example, takes months of regular practice to be able to use it on real production systems.

And, as someone who’s been practising TDD and teaching it for more than 25 years, I know it requires ongoing mindful practice to maintain the habits that make it work. Use it or lose it!

An experienced guide can be incredibly valuable in that journey. It’s unrealistic to expect developers new to these practices to figure it all out for themselves.

Maybe you’re lucky to have some of the 1% of software developers – yes, it really is that few – who can actually do this stuff for real. Or even one of the 0.1% who has had a lot of experience helping developers learn them. (Just because they can do it, it doesn’t necessarily follow that they can teach it.)

This is why companies like mine exist. With high-quality training and mentoring from someone who not only has many thousands of hours of practice, but also thousands of hours of experience teaching these skills, the journey can be rapidly accelerated.

I made all the mistakes so that you don’t have to.

And now for the good news: when you build this development capability, release cycles and lead times shrink, and reliability actually improves, whether you’re using AI or not.

The AI-Ready Software Developer: Conclusion – Same Game, Different Dice

In this series, I’ve explored the principles and practices that teams seeing modest improvements in software development outcomes have been applying.

More than four years after the first “AI” coding assistant, GitHub Copilot, appeared, the evidence is clear. Claims of teams achieving 2x, 5x, even 10x productivity gains simply don’t stand up to scrutiny. No shortage of anecdotal evidence, but not a shred of hard data. It seems that when we measure it, the gains mysteriously disappear.

The real range, when it’s measured in terms of team outcomes like delivery lead time and release stability, is roughly 0.8x – 1.2x, with negative effects being substantially more common than positives.

And we know why. Faster cars != faster traffic. Gains in code generation, according to the latest DORA State of AI-Assisted Software Development report, are lost to “downstream chaos” for the majority of teams.

Coding never was the bottleneck in software development, and optimising a non-bottleneck in a system with real bottlenecks just makes those bottlenecks worse.

Far from boosting team productivity, for the majority of “AI” users, it’s actually slowing them down, while also negatively impacting product or system reliability and maintainability. They’re producing worse software, later.

Most of those teams won’t be aware that it’s happening, of course. They attached a code-generating firehose to their development plumbing, and while the business is asking why they’re not getting the power shower they were promised, most teams are measuring the water pressure coming out of the hose (lines of code, commits, Pull Requests) and not out of the shower (business outcomes), because those numbers look far more impressive.

The teams who are seeing improvements in lead times of 5%, 10%, 15%, without sacrificing reliability and without increasing the cost of change, are doing it the way they were always doing it:

  • Working in small batches, solving one problem at a time
  • Iterating rapidly, with continuous testing, code review, refactoring and integration
  • Architecting highly modular designs that localise the “blast radius” of changes
  • Organising around end-to-end outcomes instead of around role or technology specialisms
  • Working with high autonomy, making timely decisions on the ground instead of sending them up the chain of command

When I observe teams that fall into the “high-performing” and “elite” categories of the DORA capability classifications using tools like Claude Code and Cursor, I see feedback loops being tightened. Batch sizes get even smaller, quality gates get even narrower, iterations get even faster. They keep “AI” on a very tight leash, and that by itself could well account for the improvements in outcomes.

Meanwhile, the majority of teams are doing the opposite. They’re trying to specify large amounts of work in detail up-front. They’re leaving “AI agents” to chew through long tasks that have wide impact, generating or modifying hundreds or even thousands of lines of code while developers go to the proverbial pub.

And, of course, they test and inspect too late, applying too little rigour – “Looks good to me.” They put far too much trust in the technology, relying on “rules” and “guardrails” set out in Markdown files that we know LLMs will misinterpret and ignore randomly, barely keeping one hand on the wheel.

As far as I’ve seen, no team actually winning with the technology works like that. They’re keeping both hands firmly on the wheel. They’re doing the driving. As AI luminary Andrej Karpathy put it, “agentic” solutions built on top of LLMs just don’t work reliably enough today to leave them to get on with it.

It may be many years before they do. Statistical mechanics predicts it could well be never, with the order-of-magnitude improvement in accuracy needed to make them reliable enough (wrong 2% of the time instead of 20%) calculated to require 10²⁰ times the compute to train. To do that on similar timescales to the hyperscale models of today would require Dyson Spheres (plural) to power it.

Any autonomous software developer – human or machine – requires Actual Intelligence: the ability to reason, to learn, to plan and to understand. There’s no reason to believe that any technology built using deep learning alone will ever be capable of those things, regardless of how plausibly they can mimic them, and no matter how big we scale them. LLMs are almost certainly a dead end for AGI.

For this reason I’ve resisted speculating about how good the technology might become in the future, even though the entire value proposition we see coming out of the frontier labs continues to be about future capabilities. The gold is always over the next hill, it seems.

Instead, I’ve focused my experiments and my learning on present-day reality. And the present-day reality that we’ll likely have to live with for a long time is that LLMs are unreliable narrators. End of. Any approach that doesn’t embrace this fact is doomed to fail.

That’s not to say, though, that there aren’t things we can do to reduce the “hallucinations” and confabulations, and therefore the downstream chaos.

LLMs perform well – are less unreliable – when we present them with problems that are well-represented in their training data. The errors they make are usually a product of going outside of their data distribution, presenting them with inputs that are too complex, too novel or too niche.

Ask them for one thing, in a common problem domain, and chances are much higher that they’ll get it right. Ask them for 10 things, or for something in the long-tail of sparse training examples, and we’re in “hallucination” territory.

Clarifying with examples (e.g., test cases) helps to minimise the semantic ambiguity of inputs, reducing the risk of misinterpretation, and this is especially helpful when the model’s working with code because the samples they’re trained on are paired with those kinds of examples. They give the LLM more to match on.
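To make that concrete, here’s the kind of clarifying example I mean. The names and the pricing rule are made up purely for illustration, and the implementation is included only so the snippet compiles; in practice, that’s the part you’d be asking the model to generate from the test.

import static org.junit.jupiter.api.Assertions.assertEquals;
import org.junit.jupiter.api.Test;

// A single concrete test case pins down what "10% off orders of £100 or more"
// actually means far more precisely than prose alone.
class DiscountCalculatorTest {

    @Test
    void tenPercentOffOrdersOfOneHundredPoundsOrMore() {
        DiscountCalculator calculator = new DiscountCalculator();
        assertEquals(180.0, calculator.discountedTotal(200.0), 0.001);
    }
}

// Included only to make the example complete; this is what the model would be asked for.
class DiscountCalculator {
    double discountedTotal(double orderTotal) {
        return orderTotal >= 100.0 ? orderTotal * 0.9 : orderTotal;
    }
}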

Contexts need to be small and specific to the current task. How small? Research suggests that the effective usable context sizes of even the frontier LLMs are orders of magnitude smaller than advertised. Going over 1,000 tokens is likely to produce errors, but even contexts as small as 100 tokens can produce problems.

Attention dilution, drift, “probability collapse” (play one at chess and you’ll see what I mean), and the famous “lost in the middle” effect make the odds of a model following all of the rules in your CLAUDE.md file, or all the requirements for a whole feature, vanishingly remote. They just can’t accurately pay attention to that many things.

But even if they could, trying to match on dozens of criteria simultaneously will inevitably send them out-of-distribution.

So the smart money focuses on one problem at a time and one rule at a time, working in rapid iterations, testing and inspecting after every step to ensure everything’s tickety-boo before committing the change (singular) and moving on to the next problem.

And when everything’s not tickety-boo – e.g., tests start failing – they do a hard reset and try again, perhaps breaking the task down into smaller, more in-distribution steps. Or, after the model’s failed 2-3 times, they write the code themselves to get out of a “doom loop”.

There will be times – many times – when you’ll be writing or tweaking or fixing the code yourself. Over-relying on the tool is likely to cause your skills to atrophy, so it’s important to keep your hand in.

It will also be necessary to stay on top of the code. The risk, when code’s being created faster than we can understand it, is that a kind of “comprehension debt” will rapidly build up. When we have to edit the code ourselves, it’s going to take us significantly longer to understand it.

And, of course, it compounds the “looks good to me” problem with our own version of the Gell-Mann amnesia effect. Something I’ve heard often over the last 3 years is people saying “Well, it’s not good with <programming language they know well>, but it’s great at <programming language they barely know>”. The less we understand the output, the less we see the brown M&Ms in the bowl.

“Agentic” coding assistants are claimed to be able to break complex problems down, and plan and execute large pieces of work in smaller steps. Even if they can – and remember that LLMs don’t reason and don’t plan, they just produce plausible-looking reasoning and plausible-looking plans – that doesn’t mean we can hit “Play” and walk away to leave them to it. We still need to check the results at every step and be ready to grab the wheel when the model inevitably takes a wrong turn.

Many developers report that LLM accuracy falls off a cliff when tasked with making changes to code that lacks separation of concerns, and we know why this is too. Changing large modules with many dependencies brings a lot more code into play, which means the model has to work with a much larger context. And we’re out-of-distribution again.

The really interesting thing is that the teams DORA found were succeeding with “AI” were already working this way. Practices like Test-Driven Development, refactoring, modular design and Continuous Integration are highly compatible with working with “AI” coding assistants. Not just compatible, in fact – essential.

But we shouldn’t be surprised, really. Software development – with or without “AI” – is inherently uncertain. Is this really what the user needs? Will this architecture scale like we want? How do I use that new library? How do I make Java do this, that or the other?

It’s one unknown after another. Successful teams don’t let that uncertainty pile up, heaping speculation and assumption on top of speculation and assumption. They turn the cards over as they’re being dealt. Small steps, rapid feedback. Adapting to reality as it emerges.

Far from “changing the game”, probabilistic “AI” coding assistants have just added a new layer of uncertainty. Same game, different dice.

Those of us who’ve been promoting and teaching these skills for decades may have the last laugh, as more and more teams discover it really is the only effective way to drink from the firehose.

Skills like Test-Driven Development, refactoring, modular design and Continuous Integration don’t come with your Claude Code plan. You can’t buy them or install them like an “AI” coding assistant. They take time to learn – lots of time. Expert guidance from an experienced practitioner can expedite things and help you avoid the many pitfalls.

If you’re looking for training and coaching in the practices that are distinguishing the high-performing teams from the rest – with or without “AI” – visit my website.

Are You Training Your Junior Developers, Or Hazing Them?

One of the ways I feel lucky in my software development career is in how I got started.

I learned programming by building – well, trying to build – programs. Complete working programs – mostly simple games – on computers I had total control over.

I designed the games (remember graph paper?). I composed the music. I wrote the code. I tested the programs – perhaps not as thoroughly or as often as I should have. I copied the C30 cassettes. And I swapped them for other home-produced games on the playground. I even sold a couple.

I was the CEO, CTO, head of sales and marketing, product manager and lead developer, head of distribution, and QA manager of my own micro-tech company… that just happened not to make any real money, but that’s a minor detail.

I did it all, hands-on.

Then, after I stumbled out of university and needed money, I freelanced for a while, working directly with customers to understand their requirements, designing user interfaces, writing code, designing databases, testing the software – perhaps not as thoroughly or as often as I should have – packaging and deploying finished products, and answering the phone when users encountered a problem. Which they did. Often.

So I started my career as a full lifecycle software developer: requirements, design, programming, databases, testing, releases and operations.

Did I screw it up at times? Oh boy, yeah! But I learned fast. I had to, to get paid. And, importantly, I got to see the whole process, work with a range of technologies, and wear a bunch of different hats.

And these were only mini projects. The world didn’t burn down when my SQL corrupted some data. The work was relatively low on risk, but high on learning. It built my competence and my confidence quickly.

When I got my first salaried job as a “software engineer”, I was then given what I would describe as a 2-year apprenticeship where I learned a lot of foundational stuff that would have been damned useful to know when I was freelancing.

And I was encouraged to try my hand at a wide range of things. My many screw-ups just never made it into any releases. The guardrails were very effective.

Importantly, while I was given a fairly free rein, I was closely supervised and mentored by developers with many years more experience. And I was given a lot of training.

Sadly, this is very different to how most developers start their careers these days. Instead of creating a wide range of learning opportunities on low-risk work, entry-level devs are confined to narrow, menial tasks – typically the ones “senior” developers would prefer not to do. “Training”, for too many, looks more like hazing.

It’s not at all uncommon for a junior dev to spend 6-12 months doing little else but fixing bugs on production systems, or working through SonarQube issues, or manning the support hotline. “It’s all they’re good for.”

New features? Product strategy? Talking to customers? Architecture? UX? Process improvement? The interesting stuff? That’s senior work.

Most often, it’s the risk that they’ll make mistakes that deters managers from giving junior developers too much freedom. But that’s a fundamental misjudgment. Mistakes and failure are integral to the learning process. The real risk is that you’ll grow developers who are afraid to try.

And while they’re painting the proverbial fences, it’s rare that they get much structured training or mentoring, either. Most organisations view senior developers as too valuable to “waste” on such things.

I see it differently. I think there comes a point in a developer’s career where the real waste is not letting them share their experience.

In this sense, there are “three ages” of a software developer, as their focus shifts from mostly learning, to mostly doing, to mostly teaching.

The job of a junior developer is to grow into a productive and well-rounded practitioner. And the productivity of a junior developer should be measured not by how much they deliver, but by how fast they grow. Month-on-month, year-on-year, what difference do we see in their capability and their confidence?

Businesses are so obsessed with cooking with the green tomatoes, they forget that with more time and more watering, they’ll grow into far more versatile red tomatoes.

Keeping them in a narrow lane, blinkered to the wider development process, stunts their growth. I’ve met many devs with decades of experience who were to all intents and purposes still junior developers.

When we frame it in those terms, the emphasis shifts from “What value can we extract from this junior dev today?” to “What potential can we add?”

In that light, it makes sense to structure their work around providing the most valuable learning opportunities. If they create tangible business value along the way, all the better. But that’s not the primary aim. The primary aim is to produce better software developers, and their work is a vehicle for that.

And if they somehow manage to burn the house down, that’s a you problem. How did their mistake make it into production?

Training & Mentoring is a Common Good

“Why should I train my developers? They’ll just leave.”

“Why is it so hard to find developers with the skills I need?”

Over a thirty-three-year career, I’ve heard variations on these questions many, many times. And, typically, from the exact same people.

Many businesses are reluctant to invest in developers because tenures tend to be shorter than the time it takes for that investment to pay off. By the time a junior turns into a genuinely productive professional developer – someone who can work largely unsupervised and create more value than they cost – chances are they’ll be doing that somewhere else.

The mental leap employers don’t seem able to take is understanding that “somewhere else” is, from a previous employer’s perspective, them.

Where did their productive developers come from? Did they emerge fully-formed from a college or a school or a boot camp? Or are they products of years of learning – on the job, from books, from courses, from each other, and so on – that somebody invested in?

Somebody took that hit. Whether they did it explicitly by paying for training and education or providing mentoring, or implicitly by shouldering the learning curve in everyday work, the only reason there’s any pasta sauce available at all is because somebody grew tomatoes.

I encourage employers to think of professional development in terms of paying it forward. The mindset that it should only be provided when there’s a direct – and often immediate – benefit has led to an industry of perpetual beginners.

Developers whose growth was stunted by a lack of investment in knowledge and skills set the example for inexperienced developers who are about to see their growth stunted in turn.

Because nobody sees it as their responsibility. It’s that patch of grass that doesn’t belong to anybody, so nobody cuts it, even though everybody complains about it.

Developers who are lucky enough to have lots of free time – usually young single men – may proactively hone their skills out of hours. But that’s not a strategy that scales to a $1.5 trillion-a-year profession. That would require more structure and more investment. A lot more. Imagine if your doctor was mostly self-taught in their spare time…

And it excludes a large part of the population who might well make great developers, but have young children, or care for elderly relatives, or volunteer in their communities, and can’t find the time to build skills that – let’s be honest now – are a bigger benefit for employers than for anybody else.

It shouldn’t be expected that the grass mows itself. By all means, if you can, then good for you. But the fact remains that a skilled software developer can be worth a lot of money to a business, and I don’t think it’s at all unreasonable to expect that business to chip in.

I see developers as shared resources. Over their career, they’ll likely bring value to a bunch of different enterprises. It’s rare that we stay in the same place for our entire useful dev life.

In that sense, I see training and mentoring developers as a common good. And I believe that, in the long term, while developers should definitely own their own learning journey, employers should be expected to contribute to it while they’re together.

It should be a collaboration that hopefully brings benefit to both parties directly, but more importantly takes a long-term view and brings wider benefits to a whole range of organisations over their careers.

One benefit in particular is how developers who received proper long-term training and mentoring – it’s no secret that I’m an advocate of structured 3-5-year apprenticeships – can go on to pass their knowledge and skills on to people coming into the profession.

Frankly, if that was the norm, we might be looking at a very different industry. And, for employers, the ripples you start will eventually find their way back to your shores when you’re hiring.

“Why is it so hard to find developers who can do TDD, refactoring and Continuous Integration?” It’s because so few get expert training and mentoring in them. Invest in your developers, and build the capability to rapidly, reliably and sustainably evolve working software to meet rapidly-changing business needs.

To Build A High-Performing Team, You Need To Get Inside The Bubble

Some thoughts about the persistent 1:10:100 ratio of developers who are genuinely good, competent and “meh”…

Imagine building a team of 4 devs, and you want to tip the balance towards the 1%.

The odds of a team of 4 having 2 genuinely good developers are about 1:1,700.

The odds of that team having 3 genuinely good developers are 1:252,500.
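If you want to sanity-check those figures, here’s a rough back-of-the-envelope sketch of the calculation. It assumes exactly what the ratio above implies: a 1-in-100 chance that any randomly selected developer is genuinely good.

// Back-of-the-envelope check of the odds quoted above.
public class TeamOdds {

    // n choose k
    static double choose(int n, int k) {
        double result = 1;
        for (int i = 1; i <= k; i++) {
            result = result * (n - k + i) / i;
        }
        return result;
    }

    // Probability of exactly k "genuinely good" developers in a team of n,
    // when each is picked at random with probability p of being good.
    static double exactly(int n, int k, double p) {
        return choose(n, k) * Math.pow(p, k) * Math.pow(1 - p, n - k);
    }

    public static void main(String[] args) {
        double p = 0.01; // the "1" in the 1:10:100 ratio
        System.out.printf("2 of 4: about 1 in %,.0f%n", 1 / exactly(4, 2, p)); // ~1,700
        System.out.printf("3 of 4: about 1 in %,.0f%n", 1 / exactly(4, 3, p)); // ~252,500
    }
}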

The odds seem stacked against such teams existing. But that’s only if developers are selected from the pool at random. And that’s not how it usually works.

In the bubble of the 1% – which equates to about 4,500 developers in the UK, and 300,000 worldwide – people either know you, or they know someone who knows you. It’s a small-world network.

And in small-world networks, probabilities can be dramatically skewed. The odds of a randomly-selected developer being genuinely good may be 1:100, but the odds of a genuinely good developer knowing other genuinely good developers are pretty high.

So the odds of a 2/4 and even a 3/4 good team increase to the point where they’re quite probable indeed. If you find someone inside that bubble, you can much more easily build a high-performing team around them. Birds of a feather, and all that.

I occasionally get asked by founders to help them with their first developer hire. For all kinds of understandable reasons, they will often start out looking for a cheaper person because money’s tight. This typically rules out the 1% and the 10% and leaves them with “meh” options – inexperienced, or the “1 year’s experience 10 times” folks.

From many years working with software start-ups, I see this is a pivotal hire. Which band you go for – the 1, the 10 or the 100 – will likely determine the future trajectory of software development in the business: Good, Competent, or Meh.

Dev culture, once it takes root, is hard to shake. So you want that first hire to be setting a good example for future hires. Mentoring, in particular, is a large part of what good developers do, in my experience. I certainly wish, as an entry-level developer back in the Steam Age, that mentoring had been my first professional experience. I can’t help feeling that’s how it should work at entry level.

The code base also sets the tone. If you start out with a Big Ball of Mud, it takes much more effort to climb out of it. But also… Monkey see, monkey do. Less experienced devs will tend to imitate the style, with nobody there to tell them “This isn’t how it should be done”. (They may even be telling them “This is how it’s done!”)

But most valuable of all, a good developer will very likely know – or be able to reach – other good developers, raising your odds of building a high-performing team by orders of magnitude.

Why would a founder want a high-performing team? Their calling card is short delivery lead times and reliable releases, and the ability to sustain the pace on the same product for as long as that product’s a going concern.

They can rapidly, reliably and sustainably evolve software to meet rapidly changing needs.

Which is nice.

What Is A “Good Software Developer”, Anyway?

This is something that’s on my mind a lot, as a trainer and coach. For sure, it’s highly subjective, and depends very much on the context in which they’re working.

So I’m going to talk about what I’m looking for when clients ask me to find them a “good software developer”.

Top of the list, is an expert command of Rust.

I’m kidding, of course.

Top of the list, invariably, is communication skills. The best developers work to understand and be understood clearly and effectively. So much of what goes wrong in software development can be traced back to poor (or no) communication. Written, verbal, aural. They have good comprehension, they can articulate complex ideas simply to different audiences, and – most importantly – they actually communicate. I’ve known some great communicators who did everything they could to avoid having to do it.

Closely related, I’m also interested in their empathy and emotional intelligence. More on that later.

In terms of technical skills and knowledge, I look for someone who is near enough to what the team needs in terms of problem domain, programming languages, tech stacks, tools and so on. The question is, how long might it take them to get up to speed?

If they’re a C# developer with other good qualities in abundance, it’s probably worth investing a few weeks letting them wrap their head around Java. If they have no physics background, and you’re working on software for particle accelerators, it could be years before they’ve wrapped their head around it.

This is less about whether they’re a “good developer” and more about whether they’re good enough for the problem they’ll be working on. So it’s about fit – are they right for the part?

Next – and this might surprise you – is refactoring. It’s such an important skill regardless of what your development approach is. The ability to reshape code to accommodate changes without breaking it is worth its weight in gold. It also requires a fairly advanced ability to reason about code and about design.

Long-form refactoring, in particular, hints at a maturity as a programmer and software designer that’s sadly hard to find. If I hired someone who hadn’t been introduced to refactoring yet, training them would be the next step.

Having interviewed hundreds of developers, I know that good refactoring skills are a “tell”. They’re a good omen about the candidate’s technical skills more generally.

More generally, someone who works in tight feedback loops, testing and reviewing code continuously, committing often, and integrating on the trunk many times a day, will tend to generate better outcomes: shorter lead times, more reliable releases, and a lower cost of changing the software. (It never ceases to surprise me when employers say that doesn’t interest them much.)

The best developers I’ve seen have been able to stand back and see the bigger picture and how their work fits into it. As I sometimes say, they see the arrows, not just the boxes. They’re not like those actors who only read their lines in the script, or think about their scenes, and take no interest in the story or the other characters.

They’re effective operating at multiple levels, taking an interest not just in their code, but in what other devs on the team are doing – watch them pull changes from the repo, do they read them? – and in what other teams working on connected stuff are doing, and what’s happening in UX design, product management, enterprise architecture and all the other stuff going on around them.

In this sense, they’re also pretty situationally aware.

My version of a “good developer” is focused on achieving outcomes, not on producing output. They aim to create value, not just bash out features or close tickets. And in that sense, they take an active interest in understanding the problems they’re trying to solve. They see themselves as part of a wider team that’s not just delivering software, but solving business problems.

The really good ones play a part in steering the ship towards better outcomes and greater value. They’re not just waiters taking orders. They’re helping to design the menu.

The developers I recommend to clients can, and often have, worn multiple hats in their careers. They’re part programmer, part tester, part architect, part product manager, part UX designer, and other things. I would say they are “T-shaped” developers.

This is especially important if we want to avoid over-specialisation on the team, where experts become potential bottlenecks. It also helps enormously, when it comes to understanding our impact on the wider process, to have walked a mile in other people’s shoes. I could argue that if more of us were “T-shaped”, fewer of us would believe that “AI” code generators are increasing productivity. We’d know there’s a lot more to it than that.

And they keenly understand that the unit of value creation is the team. They bring value holistically, considering the impact of their work on team members and team outcomes, as well as looking for ways to help others on the team add more value.

As Dan North recently pointed out, some of the most valuable software developers may not even show up under their own name in your Jira or GitHub stats. (I’ve worked with Tim Mackinnon, and I can confirm that he is an exceptionally good software developer.)

One of the ways that good software developers can add value is by mentoring other developers on their team. They may well be at a point where they bring the most value by growing more good software developers. So, I look for significant experience of that.

Again, this is my take, but no matter how technically gifted you are, if you sniff at the prospect of mentoring, I’d think twice about recommending you to clients who need and expect that.

And finally, they need to catch on quick. Software development’s a learning process, and the ability to learn is kind of critical. This isn’t just about being smart. It’s also about being open and willing to learn, to be out of your comfort zone. And to be prepared to fail. Every skill mastered involves going through a stage of being bad at it. It helps a lot if the culture of the team offers them the psychological safety to try and to fail.

Some developers bring that psychological safety with them. I’ve had candidates literally try e.g., TDD, or a new programming language, in the interview. Someone who’s prepared to have a crack at something, even in those kinds of pressured situations, has usually turned out to be a real asset to the team.

In interviews, I’m often picking up on the candidate’s level of risk aversion. Will they be willing to take the occasional leap into the unknown? Will they be prepared to question the status quo? Or will they be focusing their efforts on covering their own backsides and protecting the hierarchy?

I appreciate this is usually a learned behaviour. A developer who has experienced serious consequences for failure is less likely to be willing to take those leaps. And we can all try to do what we can to bring a bigger sense of adventure out of people by being supportive, and by developing systems that minimise potential consequences. (I mean, exactly how does a junior developer get access to a live database? The failure there is ours.)

The fact remains, though, that a risk-averse developer, more afraid of failing, is less likely to try, to ask “dumb” questions, and therefore less likely to learn. It stunts our growth.

This is why, when I consider the team as a whole, it’s important for developers – especially senior developers – to have enough emotional intelligence to recognise when they are the ones holding others back by mocking people when they’re trying, or punishing them when they fail. It’s something we all have to work on.

To sum up, when I’m looking for a “good developer”, I’m looking for:

  • A good communicator
  • Someone technically near enough
  • Good refactoring skills (ideally, long-form)
  • Someone who solves one problem at a time, in tight feedback loops
  • Someone who sees the bigger picture
  • Someone who is situationally aware of what’s going on around them
  • Someone who is outcome-oriented
  • Someone who can wear multiple hats
  • Someone who adds value to the team
  • Someone who catches on fast, and isn’t afraid to try new things
  • Someone with a decent amount of emotional intelligence and empathy

“Our senior developers already know this stuff, Jason?”

I hear this very often from managers who’ll invest in training entry-level developers, but only entry-level.

Do they, though?

A large-scale study of developer activity in the IDE found that, of the devs who said they did TDD, only 8% were doing anything even close in reality. Most didn’t even run their tests, let alone drive their designs from them and run them continuously. Developers checking in code they haven’t even seen compile is more common than you might think.

They may well believe that they’re doing them, of course. They learned it from someone (who learned it from someone) who learned it from e.g. a YouTube tutorial made by someone who’s evidently never actually seen it being done. (I check every year – there’s a LOT of those, and they get a LOT of views.)

After all this time working with so many different teams in a wide range of problem domains, I can tell you de facto that the practices developers claim they’re doing – TDD, refactoring, “clean code”, CI & CD, modular design etc – usually aren’t being practiced much at all. That’s the norm, I’m afraid.

Unsurprisingly, the employer therefore sees none of the benefits in shrinking lead times, more reliable releases, and a more sustainable pace of delivery. The work remains mostly fighting fires and heroics around every deployment, rapidly eating up your budget on frantic motion-without-progress.

(And now we’re seeing that being amplified by you-know-what!)

Turns out you can’t just say you’re doing it. You have to ACTUALLY DO IT to get the benefits.

The way it often plays out is:

– You send your new hires to Jason (or someone like Jason). Jason teaches them some good habits that we’ve seen over the decades are likely to reduce delivery lead times, improve release stability and lower the cost of change.

– New hires go back to their teams, where – day-in and day-out – they see senior colleagues setting a bad example, and being rewarded for heroically putting out the fires they started.

– They may resolve to find themselves a job where they’ll get to apply what they’ve learned, and not feel pressured to just hack things out like everybody else.

– But, more commonly, they’ll just give up and go with the flow. Or, more accurately, the lack of it. Their careers take the path most-travelled, and you continue to wonder why it’s so hard to find senior developers who can do this stuff.

I would urge you to consider this when deciding who needs training and mentoring. I appreciate, it’s a touchy subject with folks who claim they’re already doing these things. But there are ways you can broach it: a “refresher”, “mentoring the juniors”, etc etc.

It really helps to align teams, and make the learning more “sticky” in day-to-day work. Otherwise, there’s a very real chance your junior developers will be un-taught by their senior peers.

And then, like I said, you don’t get the benefits – just the fires.

TDD Under The Microscope #1 – Usage-Driven Design

This is the first part in a series of posts where I’m going to try to crystallise my ideas about how TDD really works, ideally expressed pseudo-formally (with pseudocode).

Workflow Linting. Like Code Linting, Only The Same.

It’s part of an ongoing side-project to create what I’ve provisionally titled a “workflow linter”. Many of us are familiar with code linters. They walk the abstract syntax tree of our code, applying rules to each node depending on what type of code element it is – rules that apply to classes, rules that apply to methods, rules that apply to parameters, and so on.

In a very real sense, programming languages can be described as Finite State Machines, and some parsers are even event-driven. Instead of building an abstract syntax tree, the rules of the language are applied as one language element transitions to the next: e.g., the reserved word “public” can only be followed by certain allowed elements, like a type name, or “void” or “static” etc.

The same idea can be applied to workflow. Boiling the kettle can be followed by pouring the water into a teapot. We do not pour the water and then boil it, just like we do not write “void public”.
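As a toy illustration (mine, greatly simplified, and not how a real parser is written), the FSM-style version of that idea is essentially a table of allowed transitions:

import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy example: the rules applied as one element transitions to the next,
// with no syntax tree in sight.
public class TokenTransitionCheck {

    // What may legally follow each element. A real grammar has many more rules.
    static final Map<String, Set<String>> ALLOWED_NEXT = Map.of(
            "public", Set.of("static", "void", "class", "TYPE_NAME"),
            "static", Set.of("void", "TYPE_NAME"),
            "void", Set.of("IDENTIFIER")
    );

    static boolean isLegal(List<String> tokens) {
        for (int i = 0; i < tokens.size() - 1; i++) {
            Set<String> allowed = ALLOWED_NEXT.getOrDefault(tokens.get(i), Set.of());
            if (!allowed.contains(tokens.get(i + 1))) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isLegal(List.of("public", "static", "void", "IDENTIFIER"))); // true
        System.out.println(isLegal(List.of("void", "public")));                         // false
    }
}

Every allowed sequence has to be spelled out explicitly, which is exactly the brittleness the rest of this post comes back to.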

Workflows As Finite State Machines vs. The Real World

Modelling workflows as FSMs can be helpful to visualise them, and that’s traditionally how processes like Test-Driven Development have been described.

A large part of my job as a trainer and mentor could be described as “workflow linting”. I observe developers doing TDD or refactoring or Continuous Integration, and I have an idea in my head of what they should be doing at any given stage in that workflow. So when I see someone start to write the code to pass the test, but we haven’t seen the test fail yet, a little light goes ping in my head, and I intervene.

And if our goal is behaviour modification – habit forming, basically – then the best time to give feedback is as it’s happening. This is one of the reasons why organisations who rely mostly on after-the-fact feedback – Pull Request code reviews, weekly or monthly appraisals etc – tend to be places where developers learn the slowest. Junior developers who are left mostly working on their own tend to remain junior developers for a long time. Sometimes forever.

So it’s important when we’re mentoring developers on specific processes to have a clear internal mental model of those workflows – the light that goes “ping” when they deviate.

The problem with internal mental models is they don’t transmit easily. So we need to express them somehow in order that we can teach them (to people, to computers, to squirrels who need to know when they shouldn’t distract us, and so on.)

In the real world, though, things are messy and noisy. Maybe I boiled the kettle but forgot to put the tea in the pot, so I have to do that next. Or maybe I got distracted by a squirrel in the garden, lost track of time and had to boil the kettle again. Maybe I don’t even have a kettle, and need to boil the water in a pan.

The stream of events created by our IDEs as we work is equally messy and noisy. Maybe I wrote the failing test but when I ran it, it actually passed first time. Maybe I decided to rewrite the assertion. Maybe I deleted the test without running it and started again. Maybe I got distracted by a squirrel in the garden. Etc.

Declarative Implied Workflow

Hardwired imperative workflows of the “A → B → C” variety can be brittle, and don’t handle mess and noise well. You’re left having to map every possible edge case, every allowable scenario. Things can get very complicated very quickly. And no matter how much we try to think of everything, we invariably miss cases. “Computer says ‘no’.”

A more flexible and robust way to describe workflow is to imply ordering of events declaratively*. Instead of “Boil the kettle, then pour the water into the teapot”, we might say “When the water is poured into the teapot, it must be at near boiling”. Now all of our edge cases work. We could have boiled it in a pan. We could have boiled it in a microwave. We could have watched the squirrel and then boiled the kettle again. Workflow is implied: the water has been boiled, but we leave ourselves many more routes to pouring it in the teapot.

A similar approach can be taken with workflows like TDD; instead of “Run the test to see it fail, then write the simplest code to pass it”, we could say “When we write the code to pass the test, it must be failing”. It’s a subtle but important distinction, and one which I’ve realised I’ve been using for a long time when I observe developers working.
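Here’s a rough sketch of that distinction, using made-up state and rule types rather than any real IDE API:

// A declarative rule doesn't prescribe the route taken; it only states what
// must be true when a particular event happens. Types here are hypothetical.
public class DeclarativeTddRule {

    // A snapshot of whatever we can observe about the workspace right now.
    record WorkspaceState(boolean hasFailingTest, boolean editingProductionCode) { }

    interface WorkflowRule {
        boolean isSatisfiedBy(WorkspaceState state);
        String violationMessage();
    }

    // "When we write the code to pass the test, it must be failing."
    static final WorkflowRule WRITE_CODE_ONLY_AGAINST_A_FAILING_TEST = new WorkflowRule() {
        public boolean isSatisfiedBy(WorkspaceState state) {
            return !state.editingProductionCode() || state.hasFailingTest();
        }
        public String violationMessage() {
            return "Writing implementation code, but no test is failing";
        }
    };

    public static void main(String[] args) {
        // Editing production code with no failing test: the little light goes "ping".
        WorkspaceState state = new WorkspaceState(false, true);
        if (!WRITE_CODE_ONLY_AGAINST_A_FAILING_TEST.isSatisfiedBy(state)) {
            System.out.println("PING: " + WRITE_CODE_ONLY_AGAINST_A_FAILING_TEST.violationMessage());
        }
    }
}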

On training courses, in particular, I’m popping my head around the proverbial door while pairs are in the middle of something. Subconsciously, I have to establish where they are in the process without seeing them follow the workflow. If they’re writing implementation code and I see a test failing that appears to be related, I can deduce that they’re writing code to pass that test. If no tests are failing, and they appear to be adding behaviour, then – without seeing them do the whole “A → B → C” – the little light goes “ping” and the “workflow linter” reports an error.

(The limitation of this meat-based workflow linter is availability. If there are six pairs doing a 90-minute exercise, they get maybe 15 minutes with the benefit of the linter. That might be 1-2 TDD cycles. This, by the way, is why follow-up coaching can be such a good investment. Hence this ongoing side-project to see if I can create an IDE plugin that can do a bit more than my t-shirt that says “NOW RUN YOUR TESTS”.)

Usage-Driven Design Formalised Declaratively

Anyhoo, back to the main feature.

UDD is a core component of TDD. Arguably, it’s the whole point of it; driving our implementation design by working backwards from how it’s used in tests.

Informally, it goes like this: you use it in a test, and that tells you that you need it. Usage comes first. Practically, I don’t declare a ShoppingBasket class:

public class ShoppingBasket {
}

And then instantiate it in a test:

@Test
void totalOfEmptyBasket() {
   ShoppingBasket basket = new ShoppingBasket();
}

I instantiate it in my test, and when the compiler tells me that class doesn’t exist, that tells me that I need to declare it. Use it, and that tells you that you need it. It always flows in that direction.

Formalising this rule declaratively is actually pretty straightforward; any time I declare something, it must either be a test (a test method or a test class), or it must already have a reference – it must already be used.

In pseudocode, something like:

declaration.isTest || declaration.references.count > 0

Applying this rule when a declaration is added to the code’s syntax tree should catch anyone declaring things that are not used in a test, or used by something that’s used in a test – basically, nothing can exist that isn’t in the call stack of a test except a test.
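Sketched in Java with made-up types (not the plugin itself, just the shape of the rule), it might look something like this:

import java.util.List;

// Hypothetical model of the code's declarations, just to show the rule in action.
public class UsageDrivenRule {

    record Declaration(String name, boolean isTest, List<String> references) { }

    // declaration.isTest || declaration.references.count > 0
    static boolean isAllowed(Declaration declaration) {
        return declaration.isTest() || !declaration.references().isEmpty();
    }

    public static void main(String[] args) {
        Declaration test = new Declaration("totalOfEmptyBasket", true, List.of());
        Declaration used = new Declaration("ShoppingBasket", false, List.of("totalOfEmptyBasket"));
        Declaration unused = new Declaration("BasketValidator", false, List.of());

        System.out.println(isAllowed(test));   // true: it's a test
        System.out.println(isAllowed(used));   // true: instantiated in a test
        System.out.println(isAllowed(unused)); // false: nothing uses it yet
    }
}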

I might even go so far as to proclaim this as a formal definition of the “test-driven” part of TDD. Nothing can exist unless a test is directly or indirectly using it. Or, as I tell customers, “If there isn’t a test for it, you ain’t getting it”.

Of course, this rule has to be adapted to work in Java, with JUnit, inside, say, IntelliJ’s or Eclipse’s event model, so translation is required. But I have a version of this in an IntelliJ plugin, and it does appear to work, and to be resistant to mess and noise.

* Formal Methods folks might think this is a bit like temporal logic. It’ll be our little secret.

The Obligatory Post About A Profession of Software Development

When I moan about the immature state of our now 80-year-old profession, some folks will nod along, some will argue that *all* professions have incompetent practitioners (medicine is often cited as an example), and some will say that there’s no widely-agreed-upon body of knowledge on which to build a mature profession.

I’m not sold on the notion that, say, the medical profession is just as bad. For sure, there are bad doctors. But it’s a question of degrees. Are 90% of them incompetent? I’d be surprised to meet a doctor who’d never heard of sterilising instruments.

But I meet the equivalent software developers all the time, who have somehow missed out on some pretty fundamental stuff. I consider continuous testing to be “foundational”, for example: the equivalent of sterilising our instruments. We really shouldn’t be operating on production code without it. And yet it remains stubbornly a minority pursuit, despite all the evidence that it tends to produce better outcomes for our patients 🙂

I’m also not sold on the argument that there’s no consensus on what works and what doesn’t in software development. When it comes to the foundational stuff, the jury is very much /not/ out. While you can always find people who will disagree that, e.g., iterative and incremental delivery or user-centred design are good things generally, the fact is that they’ve enjoyed a majority consensus for decades.

What we would recognise as modern software development has been established since the late 1980s. Not much has been added to that body of knowledge since (though there’s been plenty of “churnovation” – variations on those themes – and, frankly, our computers just got a *lot* faster, enabling us to turn the dials up way past 11).

It’s some of these core foundations that teams learn on my training courses (check out the Codemanship YouTube channel for oodles of free tutorials, BTW).

So I believe that we are somewhat exceptional in being a profession of perpetual beginners, for a variety of reasons, and I also believe that there is a consensus, backed by a significant body of data (e.g., DORA), on what – in general terms, at least – tends to produce the best outcomes for our customers.

And I can’t help feeling that our profession could organise itself better to ensure that fewer developers can work for years and years without being exposed to these fundamentals.

What Is ‘Leadership’ Anyway?

If you spend any time on LinkedIn you’re likely to bump into content about this thing called “leadership”. Many posters fancy themselves as experts in this mysterious quality. Many promote themselves as professional “leaders”.

I’m sure you won’t be surprised to learn that I think this is nonsense. And now I’m going to tell you why.

Leading Is Not What You Think It Is?

Let’s think of what that word means: “Lead the way”, “Follow my lead”, “Tonight’s leading news story”, “Mo Farah is in the lead”.

When you lead, it usually means that you go first.

Leading is distinctly different from commanding or inspiring, but that’s what many professional “leaders” mistake it for.

Leaders don’t tell people where to go. They show people the way by going first.

I don’t tell people to write their tests first. I write my tests first and show them how. I lead by example.

‘Leader’ Is Not A Job Title

Organisations appoint what they believe to be good leaders into roles where leading by example is difficult, if not impossible. They give them titles like “Head of” and “Director of” and “Chief” and then promote them away from any activity where they would have the time to show rather than tell.

The real leaders are still on the shop floor. It’s the only place they can lead from.

And, as we’ve probably all experienced, promoting the people who could set the best example into roles where they can only tell, not show, is a very common anti-pattern.

We Are Not Your Flock

Another common mistake is to see leadership as some kind of pastoral care. Now, I’m not going to suggest that organisations shouldn’t take an interest in the welfare of their people. Not just because happy workers make better workers, but because they are people, and therefore it’s the right thing to do.

And executives could set examples – like work-life balance, like the way they treat people at all levels of the corporate ladder, and like how much they pay people (yeah, I’m looking at you, gig economy) – but that’s different to the way many of them perceive that role.

Often, they’re more like religious leaders, espousing principles for their followers to live by, while indulging in drug-fuelled orgies and embezzling the church’s coffers.

And the care that most people need at work is simply to not make their lives worse. If you let them, grown-ups will grown-up. They can buy their own massage chair if they want one. Nothing more disheartening than watching managers impose their ideas about well-being on to actual adults who are allowed to drink and drive and vote.

If people are having problems, and need help and understanding, then be there for that. Don’t make me go to paintball. I don’t need it, thanks.

The Big Bucks

Most developers I know who moved into those “leadership” roles knew it was a mistake at the time – for the organisation and for themselves – but they took the promotion anyway. Because “leadership” is where the big bucks are.

The average UK salary for a CTO is £85,000. For a senior developer, it’s £60,000 (source: itjobswatch.co.uk). But how senior is “senior”? I’m quite a senior developer. Most CTOs are junior by comparison.

And in most cases, CTO is a strategic command – not a leadership – role (something I freely admit I suck at). A CTO cannot lead in the way I can, because I set an example for a living. For all I know, there are teams out there I’ve never even met who’ve been influenced more by me than by their CTO.

‘Leader’ Is A Relative Term

When I’ve been put in charge of development teams, I make a point of not asking developers to do anything I’m not prepared to at least try myself, and this means I’m having to learn new things all the time. Often I’m out of my comfort zone, and in those instances I need leadership. I need someone to show me the way.

Leadership is a relationship, not a role. It’s relative. When I follow you, and do as you do, then you are the leader. When you do as I do, I’m the leader.

In the course of our working day, we may lead, and we may follow. When we’re inexperienced, we may follow more than we lead. But every time you’ve shown someone how you do something and they’ve started to do it too, you’re a leader.

Yes, I know. That sounds like teaching. Funny, that.

But it doesn’t have to be an explicit teacher-student relationship. Every time you read someone’s code and think “Oh, that’s cool. I’m going to try that”, you have been led.

It’s Lonely At The Top

For sure, there are many ways a CxO could lead by example – by working reasonable hours, by not answering emails or taking calls on holidays, by putting their trust in their people, or by treating everyone with respect. That’s a rare (and beautiful) thing. But it’s the nature of hierarchies that those kinds of people tend not to get ahead. And it’s very difficult to lead by example from a higher stratum. If a CTO leaves the office at 5:30pm, but none of her 5,000 employees actually sees it, does it make a sound?

Show, Don’t Tell

So, leadership is a very distinct thing from command. When you tell someone to do something, you’re commanding. When you show them how you do it – when you go first – that’s leading.

“Show, don’t tell” would be – if it had one – Codemanship’s mission statement. Right from the start, I’ve made a point of demonstrating – and not presenting – ideas. The PowerPoint content of Codemanship training courses has diminished to the point of being almost non-existent over the last 12 years.

And in that sense, I started Codemanship to provide a kind of leadership: the kind a CTO or VP of Engineering can’t.

Set Your Leaders Free

I come across so many organisations who lack technical leadership. Usually this happens because of the first mistake – the original sin, if you like – of promoting the people who could be setting a good example into roles where they no longer can, and then compounding that mistake by stripping authority and autonomy from people below that pay grade – because “Well, that’s leadership taken care of”.

I provide a surrogate technical leadership service that shouldn’t need to exist. I’m the CTO who never took that promotion and has time – and up-to-date skills – to show you how to refactor a switch statement. I know people who market themselves as an “Interim CTO”. Well, I’m the Interim Old Programmer Who’s Been Around Forever.

I set myself free by taking an alternative career path – starting my own company. I provide the workshops and the brown bag sessions and the mobbing sessions and the screencasts and the blog posts that you could be creating and sharing within your organisation, if only they’d let you.

If only they’d trust you: trust you to manage your own time and organise things the way you think will work best – not just for getting things done, but for learning how to do them better.

People working in silos, keeping their heads down, is antithetical to effective leadership. Good ideas tend to stay in their silos. And my long experience has taught me that broadcasting these ideas from on-high simply changes nothing.

Oh, The Irony

I believe this is a pretty fundamental dysfunction in organisational life. We don’t just have this problem in tech: we see it repeated in pretty much every industry.

Is there a cure? I believe so, and I’ve seen and been involved with companies who’ve managed to open up the idea of leadership and give their people the trust and the autonomy (and the resources) to eventually provide their own internal technical leadership that is self-sustaining.

But they are – if I’m being honest – in the minority. Training and mentoring from someone like me is more likely to lead to your newly inspired, more highly skilled people moving on to a company where they do get trust and autonomy.

This is why I warn clients that “If you water the plant, eventually it’ll need a bigger pot”. And if pushed to describe what I do, I tell them “I train developers for their next job”. Would that it were not so, but I have no control over that.

Because I’m not in charge.