Claude Code in Production: Velocity Gains From Real Engineering Team

Welcome, Developer 👋

Six months ago I ran a pilot. I wanted to know if AI tools could actually change how my team of engineers and QAs ships software, or if this was just another hype cycle.

The results changed how I lead.

Based on a true story. Here’s what happened.

Documentation went from 3 hours to 1

Technical documentation used to eat half a day. Now my team drafts it with Claude Web and reviews it in an hour. Same depth. Same quality bar. A third of the time.

That’s not a small win. That’s hours back every week, per person, that used to disappear into writing instead of building.

Architecture decisions got sharper before they ever reached the org

Before any solution design goes in front of stakeholders, we run it through Claude first. We pressure test the approach, argue the tradeoffs, poke holes in our own thinking.

By the time a proposal reaches the wider organization, it’s already survived a round of scrutiny. Fewer surprises in review. Faster approvals. Better decisions.

Claude Code changed what a sprint can hold

This is where it gets real.

My team now uses Claude Code to build features, write automated tests, write end-to-end test scenarios, and even draft Jira stories following proper given-when-then format. What used to take back and forth between engineering and QA now happens inside a single flow.

APIs that used to take 2.5 sprints, complete with documentation, unit tests, and supertest coverage, now ship in one.

Entire app pages that used to take a couple of days now ship in one.

I want to be clear about what that means. It doesn’t mean less rigor. It means the rigor happens faster.

Figma MCP made pixel-perfect UI the default, not the exception

Connecting Figma directly into the development workflow through MCP means UI components come out matching design intent from the start. No more rounds of “move this 2px left.” The gap between design and implementation shrank to almost nothing.

Does it get perfect right away? Not really, but only few tweaks needed.

Infrastructure and CI/CD got faster to write, not riskier to run

Complex Terraform scripts. GitHub Actions pipelines for automated deployment, integration, and testing. These are areas where mistakes are expensive, so speed alone was never the goal.

What changed is how fast we can get to a first solid draft, and how much faster we can review it against our own standards. The infrastructure work still gets the same scrutiny it always did. It just starts from a better place.

Skills and a custom plugin turned tribal knowledge into something Claude could use

The generic version of Claude Code is good. The version that knows our stack is better.

We built out Claude skills for our recurring workflows, and went a step further with a custom plugin tailored to our own app and APIs. Instead of re-explaining our conventions, our data models, and how our services talk to each other every single time, that context now lives in the tooling itself.

The difference shows up immediately. Less back and forth. Fewer wrong assumptions. Output that already looks like it was written by someone who’s worked in our codebase for years, because in a sense, it has.

None of this stays useful on its own. My team reviews our skills and coding standards regularly, updating them as our architecture evolves and as we learn what works and what doesn’t. Skills that go stale produce stale output. Treating them as a living part of the codebase, not a one-time setup, is what keeps Claude Code sharp instead of just fast.

We’re building AI into the product too

It’s not just internal tooling. My team is building AI features directly into our product using RAG applications with the Anthropic SDK. We’re not just consumers of AI assistance anymore. We’re shipping AI capability to our own users.

What didn’t change, and this matters more than the speed

Code reviews are still done by humans. Nothing merges to main without a human approving it on a pull request. That line hasn’t moved and it isn’t going to.

No code or solution proposed by Claude gets accepted without being judged first. My engineers and QAs are still the ones responsible for quality and security standards. AI proposes. People decide.

There’s a second-order effect I didn’t expect. Junior developers are learning faster by reviewing AI-generated changes in code review. They’re seeing patterns, tradeoffs, and edge cases surface in a single PR that used to take months of exposure to encounter naturally. It’s become an accelerant for growth, not a shortcut around it.

Remember, “be quick, but don’t hurry”.

The real win isn’t speed. It’s what speed buys back

When documentation, boilerplate, and repetitive test scaffolding stop eating the day, something else opens up. Time to try new technologies. Time to propose real innovation instead of defending against the backlog. Time for people to be creative, to learn, to think past the next ticket.

That’s the part I didn’t expect going into this pilot. I thought I was measuring velocity. I was actually measuring how much space my team had left to think.

As a leader, that’s the number I care about most.

Stay focused, Developer!