The productivity gains from AI coding tools are real. But the more interesting thing I learned wasn't about productivity at all — it was about how repeated competence quietly changes the way you trust.
Lessons from weeks of real AI-assisted development at Cyborg using Claude Code, Codex, and similar tools.
At Cyborg, we ship real production software for clients across finance, healthcare, and operations. So when we started pushing AI coding tools harder into our day-to-day engineering work over the past few weeks, the goal was simple: find out where they actually help, where they quietly hurt, and how a serious engineering team should think about working with them.
Not in the "generate a toy app" sense. Actual engineering work:
And honestly, the overall experience was surprisingly positive.
In supervised workflows, these tools are genuinely good at:
This was never a fully autonomous setup where specifications were dumped into an AI and left unattended. We still handled:
The AI acted more like a fast implementation partner than an autonomous engineer. And with that structure in place, it worked much better than we initially expected.
Most implementation tasks actually went reasonably well. The more interesting failures happened around:
For example, during one debugging session, a production widget stopped loading correctly. The AI initially diagnosed the issue as a frontend caching problem and modified cache-busting logic in the loader.
Technically, the change was valid. The problem was that caching had nothing to do with the actual issue.
Later in the same session, the AI replaced an internal API abstraction with a more generic implementation because the generic pattern looked more familiar. But the abstraction existed for a reason and likely handled concerns such as:
Eventually, after several rounds of investigation, the real root cause turned out to be something entirely outside the repository logic — a required database migration had not been applied in production.
Individually, these were manageable mistakes. But collectively, they revealed an interesting pattern: the AI was often strong at local implementation reasoning but weaker at reasoning about the broader operational context, such as what had actually been deployed and applied in production.
And honestly, that distinction matters much more in real systems than most AI demos suggest.
One particular workflow made this even clearer. The repository already had:
The task itself sounded simple: move a commit off the branch while preserving the existing uncommitted work.
As part of these workflow experiments, we had intentionally allowed the AI assistant to perform certain repository operations directly, so we could better understand how far supervised AI-assisted workflows could practically go.
At one point, the assistant ran a hard reset on the branch: git reset --hard.
The command technically solved one objective: it removed the commit from the branch. But a hard reset also resets the index and working tree to the target commit, so it discarded the uncommitted changes to the modified files.
Fortunately, this did not turn into a catastrophic loss. Important workflows were already being handled carefully: backups existed, Git flow was still supervised, and the implementation structure remained controlled. So practically, it became more of a learning moment than a disaster.
Still, the incident exposed something important. The AI had enough information available to avoid the mistake. The working tree was dirty. Modified files were visible. The instruction was to preserve existing work.
Yet it optimized primarily for "move the commit off the branch" without fully protecting "preserve unrelated working tree state."
That distinction stayed with us long after the debugging session ended.
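One way to make "preserve unrelated working tree state" an explicit check rather than an implicit hope is to look for uncommitted changes before any destructive Git command runs. The following is a minimal Python sketch under stated assumptions, not a description of any tooling used in the session above: the helper names (dirty_paths, working_tree_status, guard_hard_reset) are hypothetical, and it assumes git is available on the PATH. It relies on git status --porcelain, which prints one line per changed or untracked path and nothing when the tree is clean.

```python
import subprocess


def dirty_paths(porcelain_output: str) -> list[str]:
    """Return the paths reported by `git status --porcelain` (v1 format).

    Each line is two status characters, a space, then the path.
    Renames appear as "old -> new" and are returned as that single string.
    """
    return [line[3:] for line in porcelain_output.splitlines() if line.strip()]


def working_tree_status(repo_dir: str = ".") -> str:
    """Run `git status --porcelain` in repo_dir and return its raw output."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def guard_hard_reset(porcelain_output: str) -> None:
    """Raise instead of allowing a hard reset while uncommitted changes exist."""
    paths = dirty_paths(porcelain_output)
    if paths:
        raise RuntimeError(
            "Refusing `git reset --hard`: uncommitted changes in "
            + ", ".join(paths)
            + ". Commit or stash them first."
        )
```

A wrapper around the assistant's Git access could call guard_hard_reset(working_tree_status()) before the reset and hand control back to a human when it raises. Running git stash first, or using git reset --keep, which aborts rather than overwrite files that have local changes, are other ways to address the same risk.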
The biggest thing we learned was not that AI makes mistakes. Humans make mistakes constantly too.
The more interesting realization was how repeated successful interactions gradually changed how we reviewed the AI's work. After enough correct implementations, useful debugging assistance, and productive iterations, we noticed ourselves moving from "verify every operational step carefully" toward "this is probably fine."
Not because we stopped caring. And not because the AI became perfect. But because repeated competence naturally builds trust.
That is where AI-assisted development becomes fundamentally different from traditional tooling.
The risk is not blind automation. The risk is gradual trust transfer.
Traditional tools do not simulate reasoning. AI systems do.
They explain themselves. Justify decisions. Navigate repositories fluently. Sound confident. And often succeed repeatedly.
After enough successful interactions, humans naturally compress their review process. That is not irrational behavior. It is how people come to rely on anything that repeatedly proves capable: senior engineers, CI pipelines, deployment systems, cloud infrastructure, automation tooling.
AI simply accelerates this effect because it combines speed, fluency, confidence, and partial correctness extremely well.
After working this way for weeks, our conclusion became surprisingly simple:
AI is becoming very good at driving implementation. Humans still need to own system state.
That includes:
In other words — AI can increasingly help execute engineering work, but humans still need to remain custodians of system integrity. At least for now.
Interestingly, these experiences did not make us stop using AI coding tools. If anything, we probably use them more now. But the workflow evolved.
We trust AI heavily for acceleration, exploration, iterative implementation, debugging assistance, and repository navigation.
But we now treat operationally sensitive actions differently — branch manipulation, resets, migrations, infrastructure changes, production-impacting decisions. Those areas still deserve slower human supervision.
Not because AI is incapable. But because operational awareness is still very different from implementation capability.
And honestly, we think that is the real lesson many engineering teams are currently learning as AI-assisted development becomes normal.
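The split between "let the AI run it" and "slow down for a human" can be enforced mechanically rather than left to habit. Below is a small Python sketch of such a gate, assuming the assistant's proposed shell commands pass through a wrapper before execution. The SENSITIVE_PREFIXES list and the requires_human_approval function are hypothetical and purely illustrative; the specific commands are examples of resets, branch manipulation, migrations, and infrastructure changes, not a list taken from our setup.

```python
import shlex

# Hypothetical deny-list of command prefixes that should pause for a human.
# The entries are illustrative examples of resets, branch manipulation,
# migrations, and infrastructure changes; a real list would be team-specific.
SENSITIVE_PREFIXES = [
    ("git", "reset"),
    ("git", "rebase"),
    ("git", "push"),
    ("git", "clean"),
    ("git", "branch", "-D"),
    ("alembic", "upgrade"),
    ("terraform", "apply"),
    ("kubectl", "delete"),
]


def requires_human_approval(command: str) -> bool:
    """Return True if the command starts with any sensitive prefix."""
    tokens = tuple(shlex.split(command))
    return any(tokens[: len(prefix)] == prefix for prefix in SENSITIVE_PREFIXES)
```

For example, requires_human_approval("git reset --hard HEAD~1") is True while requires_human_approval("git status") is False. Prefix matching is deliberately simple and easy to bypass (git -C some/path reset --hard would slip through), so a stricter variant might invert the logic and require approval for anything not on an explicit allow-list.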
The future probably is not "AI replaces engineers." It is more likely "AI increasingly drives execution while humans remain responsible for state, boundaries, and operational judgment."
So far, that combination has proven surprisingly powerful.
Working through similar questions in your own engineering function? We help teams adopt AI tools without giving up operational rigor. Learn more about our AI Implementation Services →