Velocity isn't lines per day. It's knowing which lines matter.

The entire production diff was about 15,000 lines — roughly thirty lines per engineer per day. AI's real contribution wasn't writing them. It was comprehension, not codegen.

Jun 30th 2026 | 6 min read


TL;DR

The entire production diff was about 15,000 lines: roughly thirty lines of code per engineer per day.

AI’s real contribution wasn’t writing those lines. It was understanding a vast, decades-old system fast enough that the few we wrote were the right ones. Comprehension, not codegen.

Our client was a corporate-services SaaS company, north of $150M in revenue, with a platform a whole market of firms uses for compliance and payments. They wanted to extend their platform into a new geography.

This was far broader and more complex than a single feature. It meant standing up an entire new market on the platform, complete with payments, compliance, reporting, and a dozen interconnected modules built and rebuilt over more than a decade.

We shipped, on time, into production.

And then we realized something interesting about the role AI played in the project.

Thirty lines a day

As we were merging everything to production, we looked at the diff. The whole engagement — every module, every fix, every line we touched — came to roughly 15,000 lines of code.

Across a pod of eight realfast engineers, that’s thirty lines of code a day. Per person.

We delivered with our signature speed, executing in three months what was originally estimated to be a nine-month project.

But still. Thirty lines of code per person, per day.

Plus, this wasn’t a team working with one hand tied behind its back. We ran the whole engagement on realfast’s AI-human hybrid approach: the best frontier models, our own infrastructure, the works.

Thirty lines per day was the output with all of it going full tilt.
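The arithmetic behind that figure is easy to check. Here is a minimal sketch, assuming roughly 21 working days per month (our assumption for illustration; the exact day count shifts the result only slightly):

```python
# Back-of-the-envelope check of the "thirty lines a day" figure.
# The totals come from the article; WORKING_DAYS_PER_MONTH is an assumption.
TOTAL_LINES = 15_000
ENGINEERS = 8
MONTHS = 3
WORKING_DAYS_PER_MONTH = 21  # assumed, not stated in the article


def lines_per_engineer_per_day(total_lines, engineers, months, days_per_month):
    """Average lines of production diff per engineer per working day."""
    working_days = months * days_per_month
    return total_lines / (engineers * working_days)


rate = lines_per_engineer_per_day(
    TOTAL_LINES, ENGINEERS, MONTHS, WORKING_DAYS_PER_MONTH
)
print(f"{rate:.1f} lines per engineer per day")
```

Under that assumption the average lands just under thirty, which is where the headline number comes from.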

Now, before anyone reaches for the keyboard: lines of code is a terrible measure of effort. The weight a single line carries varies across businesses, across stacks, and with how mature and how gnarly the product is. A line in a greenfield React app and a line in a twelve-year-old payments module are not the same unit.

We all know this. And yet. Thirty??

So what were we doing for three months?

The number nagged at us. Right now, the prevailing story runs something like this: AI writes the code, software is basically a solved problem, and the engineer is on the way out.

The whole promise is speed: more code, faster. And yet there we were, supposedly at the frontier of all this, shipping at that rate. If anything, we looked slow.

So either we were doing it wrong, or AI was giving us speed somewhere else. Not in lines of code, but in another form.

When we reflected on the project, the shape of that form became clear. It was not agentic speed, but agentic sense.

To explain this better, let’s look at what we spent those three months actually doing.

The hard part is the system, not the code

On any serious brownfield system, the code is the easy bit. The hard bit is the system, and our client’s platform was a beast: business logic stacked on infrastructure stacked on more business logic, and, on top of that, the layers of complexity any heavily used platform accumulates over years in production. Adding even “simple” logic in such a system is rarely simple.

A one-line change can have a blast radius you won’t see until production, because you have to account for a dozen nuances you didn’t write, engineered in a context you weren’t around to witness. Touch the wrong thing in a payments path, for instance, and you now have a compliance problem.

The cost of getting it wrong, at this scale, is enormous.

So our actual work, for most of those three months, was understanding and capturing context. We pointed agents at the codebase and had them index it, trace flows from the UI down to the database, and map dependencies, so we could see what really touched what.
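To make "what really touched what" concrete, here is a toy sketch of how a dependency map can answer the blast-radius question. It is not the tooling used on the engagement: the module names and edges are hypothetical, and a real map would be extracted from the codebase rather than typed by hand.

```python
from collections import defaultdict, deque

# Hypothetical dependency edges: (module, module it depends on).
# These names are illustrative only and do not describe the client's system.
DEPENDS_ON = [
    ("ui.invoice_form", "api.invoices"),
    ("api.invoices", "payments.ledger"),
    ("reporting.tax_summary", "payments.ledger"),
    ("compliance.filings", "reporting.tax_summary"),
    ("payments.ledger", "db.transactions"),
]


def blast_radius(edges, changed):
    """Return every module that directly or transitively depends on `changed`."""
    # Invert the edges so we can walk from a module to its dependents.
    dependents = defaultdict(set)
    for module, dependency in edges:
        dependents[dependency].add(module)

    # Breadth-first walk over the inverted graph.
    seen = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in dependents[current]:
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)

    seen.discard(changed)  # only relevant if the graph contains a cycle
    return seen


affected = blast_radius(DEPENDS_ON, "payments.ledger")
print(sorted(affected))
```

In this toy graph, a change to the ledger module reaches the compliance filings through reporting, even though the two never reference each other directly. That is the kind of indirect path that is easy to miss when reading code by hand.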

We also had agents read discovery-call recordings, scope documents, and Confluence pages: the kind of context that otherwise lives mostly in people’s heads. That way, an engineer downstream could make an informed tradeoff without waiting for the next sync with the client.

Across the engagement, our engineers spent something like 70% of their time on understanding and judgment, and 30% on writing code.

That ratio, we realized, is the point.

AI was giving us speed not in terms of code, but in terms of comprehension.

Humans bide, agents swoop

Notice what that means. We weren’t using AI tools as a faster autocomplete, typing the same code with fewer keystrokes. We were using LLMs to understand the system.

We had to hold the whole system in our heads well enough to know which thirty lines to write.

Chess players know the value of biding their time: a long sequence of quiet, passive-seeming moves that culminates in a bold attack. Most of the engagement looked like that: humans biding, and then the agent swooping down when the time was right.

Why so little code was enough

Once you genuinely understand the system, hand-writing the code is trivial. You could do it without an LLM at all. Typing code was never the bottleneck, so it was never the right metric either.

The real metric you aim for is the business impact of your code.

With deeper, agent-powered comprehension, what materially changes is the value of each line: when you know the system that well, every line you write is the right line. No wrong turns to rip out next week. No scaffolding built “just in case.” No rework when a quiet change three modules away breaks something.

That is also where the timeline went.

We did not move faster line by line. We stopped producing lines that did not need to exist.

Most of the original nine-month estimate, it turned out, was the cost of not understanding the system: the wrong approaches, the rewrites, the careful change that breaks something anyway.

That is exactly the cost agentic context capture removes. When agents put the whole system in your head up front — the code, the flows, the context trapped in calls and documents — the wrong approaches and rewrites never happen. The calendar collapses on its own.

Measure AI by how much code it writes, or how many tools your team has rolled out, and you are counting activity, not outcomes.

Instead, measure AI by how quickly it gets you to points of judgment.

Which is why the prevailing codegen framing has it backwards, at least for the kind of software most of the world runs, and runs on: old, complex, and unforgiving.

In our context, writing the code was never going to be the impossible part. Understanding a five-million-line system you have never seen, well enough to change it safely and fast enough to ship in three months instead of nine: now that is impossible without LLMs. And all in the service of moving a metric that really matters to the client.

The thirty lines were easy. Knowing they were the right thirty was the whole job.

If you’re facing a complex brownfield system and need to ship into it safely and fast, book a demo to see how we approach these projects.