The integration of AI agents as coding assistants into software development has raised questions about the long-term viability of agent-generated code. A prevailing hypothesis within the software engineering community holds that this code is "disposable": merged quickly but discarded shortly thereafter. If true, organizations risk shifting the burden from code generation to post-deployment remediation. We investigate this hypothesis through survival analysis of 201 open-source projects, tracking over 200,000 code units authored by AI agents and by humans. Contrary to the disposable code narrative, agent-authored code survives significantly longer: at the line level, it exhibits a 15.8 percentage-point lower modification rate and a 16% lower hazard of modification (HR = 0.842, p < 0.001). Modification profiles do differ: agent-authored code shows modestly elevated corrective rates (26.3% vs. 23.0%), while human code shows higher adaptive rates. These differences are small, however (Cramér's V = 0.116), and per-agent variation exceeds the agent-human gap. Turning to prediction, textual features can identify modification-prone code (AUC-ROC = 0.671), but predicting when modifications occur remains challenging (Macro F1 = 0.285), suggesting that timing depends on external organizational dynamics. The bottleneck for agent-generated code may not be generation quality, but the organizational practices that govern its long-term evolution.
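As a reading aid for the effect sizes quoted in the abstract, the following minimal Python sketch shows how a hazard ratio converts to a percentage reduction in hazard and how Cramér's V is computed from a contingency table of counts. The function names `hazard_reduction_pct` and `cramers_v` are illustrative assumptions, not part of the study's tooling, and no contingency table from the study is reproduced here; only the reported hazard ratio of 0.842 is taken from the text.

```python
import math


def hazard_reduction_pct(hazard_ratio: float) -> float:
    """Percent reduction in hazard implied by a hazard ratio below 1."""
    return (1.0 - hazard_ratio) * 100.0


def cramers_v(table: list[list[float]]) -> float:
    """Cramér's V for an r x c contingency table of observed counts.

    Uses the uncorrected Pearson chi-square statistic:
    V = sqrt(chi2 / (n * (min(r, c) - 1))).
    Assumes every row and column total is nonzero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(table), len(table[0])) - 1
    return math.sqrt(chi2 / (n * k))


# Hazard ratio reported in the abstract (HR = 0.842).
print(round(hazard_reduction_pct(0.842), 1))  # prints 15.8
```

A hazard ratio of 0.842 therefore corresponds to a hazard roughly 15.8% lower, which the abstract rounds to 16%. Cramér's V ranges from 0 (no association) to 1 (perfect association), so a value of 0.116 indicates that authorship is only weakly associated with modification type.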