Engineering with AI - The Hidden Tax
Cognitive Overload for Engineering Leads

AI makes your team faster. It may also be quietly overwhelming the people responsible for making sure it all holds together.
Two years ago, something shifted. The developers I work with stopped staring at blank files waiting for inspiration. They started shipping. Fast. A feature that once took a week arrived on Friday afternoon in a pull request. A proof-of-concept that used to require a research spike was running in a sandbox by lunchtime. On the surface, it looked like a productivity miracle. And in many ways, it was.
But somewhere in the middle of all that acceleration, I started feeling something I didn't expect: exhaustion. Not from doing less work — from doing a fundamentally different kind of work that nobody had really prepared me for. I was no longer just a senior engineer. I had quietly become the human quality gate at the end of an AI-powered production line.
This is a post about that experience — and a reflection on what it means to lead an engineering team in the age of AI-generated code, AI-generated documentation, and AI-accelerated expectations.
The Acceleration Is Real. So Is the Bottleneck.
There's no point pretending the tools aren't remarkable. I've watched a non-technical executive describe a rough idea in plain English and receive a working, deployable application within 48 hours. I've used AI myself to cut research time from hours to minutes — skimming documentation, comparing architectural options, generating functional scripts I could validate immediately.
The productivity gains are real, measurable, and genuinely exciting.
But here's what we don't talk about enough: when five developers on a team can each generate production-quality code in a matter of hours, the bottleneck doesn't disappear. It moves. It moves to the one person who has to review all of it, understand all of it, and be accountable for all of it — the lead engineer.
"I used to occasionally receive a large pull request. Now I regularly receive pull requests with more than 10,000 lines of code. That's not a small shift — it's a different job."
Ten thousand lines. Generated in a day or two. Each line potentially containing a perfectly reasonable decision that doesn't fit the codebase, the organisation's standards, or the architectural direction we've been building toward for months.
The Problem Isn't the Code. It's the Context.
When I review AI-generated code, the issue is almost never that the code is wrong. It often works. Sometimes it's elegant. The problem is that it works in a vacuum — without awareness of the larger system it's being dropped into.
A Tale of Two Auth Flows
Real scenario: A developer is building a new service that requires user authentication. They prompt their AI tool: "Implement OAuth 2.0 authentication." The AI, being genuinely helpful, implements a secure, well-structured OAuth 2.0 flow. It chooses the Client Credentials flow — reasonable for certain service-to-service contexts.
The problem: every other application in the organisation uses the Authorization Code + PKCE flow. The new service is technically secure. But it doesn't fit. It can't be managed the same way. It creates a maintenance burden, an onboarding headache, and a future security audit question.
None of this is visible in the code itself. You only know it's wrong if you know the organisation.
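To make the gap between the two flows concrete: Authorization Code + PKCE requires each client to generate a per-request code verifier and derive a challenge from it (RFC 7636), which is exactly the kind of moving part a Client Credentials implementation never has — so the two services can't be managed or audited the same way. A minimal sketch of the PKCE piece (function name is mine, for illustration):

```python
import base64
import hashlib
import secrets

def make_pkce_pair():
    # Code verifier: high-entropy random string, 43-128 chars (RFC 7636 §4.1).
    verifier = base64.urlsafe_b64encode(secrets.token_bytes(32)).rstrip(b"=").decode()
    # Code challenge: BASE64URL(SHA-256(verifier)), unpadded (RFC 7636 §4.2).
    digest = hashlib.sha256(verifier.encode()).digest()
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode()
    return verifier, challenge

verifier, challenge = make_pkce_pair()
```

The client sends the challenge with the authorization request and the verifier with the token exchange; a Client Credentials client just posts its secret. Both are "secure OAuth" — only one matches the rest of the estate.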
The same pattern repeats across every layer of a codebase: pagination strategies (cursor-based vs. page-based), API versioning conventions, logging formats, error response shapes. An AI tool has no way to know what your organisation decided in a design review six months ago. It makes a reasonable choice. The lead engineer has to catch it, explain it, and redirect it — across dozens of pull requests, week after week.
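The pagination case is worth making concrete, because the two strategies look interchangeable in a code review and only diverge under live data. A toy sketch (an in-memory list standing in for a table):

```python
def page_based(rows, page, size):
    # Offset pagination: skip (page - 1) * size rows, take the next `size`.
    start = (page - 1) * size
    return rows[start:start + size]

def cursor_based(rows, after_id, size):
    # Cursor pagination: take the next `size` rows with id > after_id.
    return [r for r in rows if r["id"] > after_id][:size]

rows = [{"id": i} for i in range(1, 8)]  # ids 1..7

# A client read page 1 (ids 1-3). Before it asks for page 2,
# a new row lands at the front of the ordering.
shifted = [{"id": 0}] + rows

# Offset page 2 now re-serves id 3 — the client sees it twice.
# The cursor query is unaffected: it continues strictly after id 3.
```

Neither choice is wrong in isolation; the problem is a new service picking one while every existing consumer assumes the other.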
Over-Engineering by Default
There's another pattern I've noticed: AI tools tend to gold-plate. Ask for a simple data endpoint and you may receive a fully instrumented, horizontally scalable service with a caching layer, a retry queue, and an event-driven architecture. These are often impressive. They're also sometimes completely unnecessary.
I've seen caching layers added to services that receive ten requests a day. I've seen event-driven patterns introduced into workflows that run once a week. The AI isn't wrong, exactly — these patterns are useful in the right context. But context is precisely what AI doesn't have. And the lead engineer has to not only identify the over-engineering, but explain why simplicity is the right call — a harder conversation than it sounds when the code in front of you looks sophisticated and well-intentioned.
Junior Engineers, Design Patterns, and the Invisible Gap
One of the trickier dynamics I've observed is the gap between what junior engineers ask for and what the codebase actually needs.
Senior engineers develop pattern instincts over years of reading code, making mistakes, and reviewing others' work. They know when to reach for a Strategy pattern versus a Factory. They recognise when a new module is inadvertently duplicating an abstraction that already exists three directories away. They feel the shape of a codebase.
Junior engineers — even talented, hard-working ones — haven't had time to build that instinct yet. And now they have a tool that can generate a thousand lines of working code from a single prompt. The code may function perfectly and still be architecturally alien to the rest of the project.
The cognitive load here is subtle but real: as a reviewer, I have to hold the existing architecture in my head, understand what the AI has generated, identify where they diverge, and then explain the divergence in a way that's instructive rather than demoralising. Every time. For every PR. On top of everything else.
"The bottleneck used to be 'can we build it?' Now the bottleneck is 'can we integrate it?' — and that second question is entirely on the human."
The Documentation Deluge
The challenge isn't confined to code. It extends upstream — into the documents, diagrams, and design artefacts that define how we build things before a single line of code is written.
AI tools are extraordinarily good at generating documentation. A single prompt can produce a 15–20 page architecture document, complete with database schema breakdowns, relationship diagrams, data flow descriptions, and edge case analyses. This is genuinely useful. It's also genuinely overwhelming.
The problem is that generating a document is not the same as understanding it. When I hand a stakeholder a comprehensive 20-page requirements document, they reasonably expect me to be able to speak to every page of it. That means I need to have read it carefully, verified it against our actual constraints, and caught any places where the AI has been confidently wrong — which happens more than I'd like to admit.
The Revision Loop
You generate a 20-page document. You identify sections that need changing. You prompt again with your revisions. The new document comes back. You read all 20 pages again to verify the changes were applied correctly — and that nothing else shifted in the process. Sometimes the AI has quietly reintroduced the very content you asked it to remove, just rephrased slightly. You catch it, or you don't.
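One partial mitigation I'd suggest (not a fix for the underlying load): diff each regenerated document against the previous revision instead of rereading all 20 pages, so unrequested changes surface mechanically. Python's standard difflib is enough for a sketch:

```python
import difflib

def doc_diff(old: str, new: str) -> str:
    # Unified diff of two document revisions. Unchanged sections collapse
    # away, so quietly reintroduced or shifted passages stand out.
    return "\n".join(difflib.unified_diff(
        old.splitlines(), new.splitlines(),
        fromfile="v1", tofile="v2", lineterm=""))

old = "Intro\nUse cursor pagination.\nSummary"
new = "Intro\nUse page-based pagination.\nSummary"
print(doc_diff(old, new))
```

It won't catch a rephrased reintroduction — "slightly rephrased" is exactly what a line diff misses — but it narrows the rereading to the lines that actually moved.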
Multiply this loop by two or three documents a day, every day, and you start to understand the weight of it.
There's also a subtler issue: stakeholders and senior leadership are using AI too. They can now arrive at meetings with polished documents, detailed proposals, and rapid-fire ideas that used to require days of preparation. The pace of conversation has accelerated. The expectation of response time has compressed. Everyone can have a fully articulated perspective on everything, almost instantly.
The lead engineer in the middle of this — responsible for translating between leadership's AI-accelerated vision and the development team's AI-accelerated execution — is now doing a kind of cognitive dance that simply didn't exist before. And it's tiring in ways that are hard to quantify.
What's Actually Missing: Governance at the Prompt Layer
The honest answer is that we don't yet have good tools for governing how AI is used within a development team. We can add linting rules, we can write coding standards documents, we can include architectural decision records in our repositories. But none of these things reach back to the moment when a developer is sitting at their keyboard, typing a prompt, and an AI tool is making a dozen invisible architectural micro-decisions.
Some teams are experimenting with prompt templates, shared context files, and AI coding guidelines — and these help at the edges. But for genuinely novel problems, for the parts of a system where there's no prior art to reference, the AI still has to make a call. And someone human still has to evaluate that call after the fact.
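For what a shared context file can look like in practice, here is a hypothetical example (file name and wording invented, conventions drawn from the scenarios above) — the kind of thing a developer pastes into every prompt, or a tool auto-loads:

```markdown
# ai-context.md — conventions every generated change must follow

- Authentication: all services use OAuth 2.0 Authorization Code + PKCE.
  Client Credentials requires an approved ADR.
- Pagination: cursor-based with opaque `next_cursor` tokens. No page/offset.
- Logging: structured JSON to stdout; required fields: `ts`, `level`,
  `service`, `trace_id`.
- Prefer the simplest design that meets stated load. No caching layers,
  queues, or event-driven patterns without a measured need.
```

It helps, as I said, at the edges: it encodes the decisions a design review already made, but it can't anticipate the ones it hasn't.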
Right now, that someone is almost always the lead engineer.
A Note to Other Leads in the Same Position
If you're a senior engineer or engineering lead and any of this feels familiar, I want to say: you're not imagining it. The cognitive load is real. It's structural, not personal. It's a consequence of a genuine and rapid shift in how software gets built — and the tools, processes, and norms for managing that shift at a human level are still catching up.
The goal of this post isn't to argue against AI tools. They've made me faster, made my team more capable, and opened up possibilities that didn't exist two years ago. The goal is to name what's happening honestly — because the first step to solving a problem is admitting it exists.
We need to have a serious conversation with our leaders about what it means to lead a team when the output volume has multiplied but the human review bandwidth hasn't. About how we set expectations with stakeholders who are also using AI and have lost their intuition for how long things actually take. About how we invest in the judgment and context-building that no AI tool can replace.
The bottleneck has moved. Now we need to figure out how to move with it.