AI code needs production debugging, Lightrun report finds
Lightrun has released its State of AI-Powered Engineering Report 2026, which found that 43% of AI-generated code requires manual debugging in production.
The report is based on an independent survey of 200 Site Reliability Engineering and DevOps leaders in the US, UK and EU, including directors, vice presidents and C-level executives at large companies. It argues that AI coding assistants and AI Site Reliability Engineers still lack the real-time production visibility teams need to trust them with autonomous reliability work.
Among the most striking findings, 88% of respondents said their organisations need two to three manual redeploy cycles to confirm that an AI-generated fix works in production; on average, teams require three redeployments to verify a single AI-suggested fix after deployment.
That manual work consumes a large share of engineering time. Developers spend an average of 38% of their week on debugging, verification and troubleshooting, or roughly two working days, according to the survey.
Trust Gap
The findings also point to weak confidence in the tools used to investigate incidents. Some 77% of engineering leaders said they lack enough confidence in current observability systems to support automated root cause analysis and remediation.
Runtime visibility emerged as a central issue. Some 60% of Site Reliability Engineering and DevOps leaders identified a lack of runtime visibility as the main bottleneck in resolving incidents, while 44% said failed investigations by AI Site Reliability Engineers or application performance monitoring tools stemmed from missing execution-level data.
The survey suggests the problem extends beyond tooling to operating practice. In 54% of high-severity incident resolutions, respondents said teams still rely on tribal knowledge rather than diagnostic evidence from AI Site Reliability Engineers or application performance monitoring platforms.
Production Limits
The report reflects a broader tension in software development as the use of AI-generated code increases. Coding assistants may reduce the time needed to write software, but validating behaviour in live systems remains difficult when teams cannot directly observe how code behaves once deployed.
The research found that 97% of engineering leaders believe AI Site Reliability Engineers operate without significant visibility into what is actually happening in production. That underlines a wider concern that AI-based tools can suggest likely causes and remedies, but still struggle to verify them in live environments without deeper execution data.
The report argues that this missing layer includes information such as variable states, memory use and how requests move through a system. Without that context, AI systems are left to infer likely answers rather than confirm them against live behaviour.
That matters because quality checks before release do not appear to eliminate a large share of downstream problems. The survey found that the 43% of AI-generated code requiring manual debugging reached production even after passing QA or staging tests.
Executive View
The study was conducted with research firm Global Surveyz and focused on senior engineering leaders responsible for reliability and operations. Respondents came from enterprises across North America and Europe, reflecting a segment of the market often among the earliest adopters of software automation tools.
Lightrun, which operates in the reliability engineering market, said the report was intended to assess how AI is changing the software development life cycle. The results suggest the bottleneck is shifting from code creation to code verification in live settings.
"Engineering organizations need runtime visibility to embrace the possibilities offered by AI-accelerated engineering. Without this grounding, we aren't slowed by writing code anymore, but by our inability to trust it," said Ilan Peleg, Chief Executive Officer of Lightrun.
"When almost half of AI-generated changes still need debugging in production, we need to fundamentally rethink how we expect our AI agents to solve complex challenges," Peleg added.