AI IDEs or Autonomous Agents? Measuring the Impact of Coding Agents on Software Development

Large language model (LLM)-based coding agents increasingly act as autonomous contributors that generate and merge pull requests, yet their real-world effects on software projects are unclear, especially compared with widely adopted IDE-based AI assistants. We present a longitudinal causal study of agent adoption in open-source repositories using staggered difference-in-differences with matched controls. Using the AIDev dataset, we define adoption as the first agent-generated pull request and analyze monthly repository-level outcomes spanning development velocity (commits, lines added) and software quality (static-analysis warnings, cognitive complexity, duplication, and comment density). Results show large, front-loaded velocity gains only when agents are the first observable AI tool in a project; repositories with prior AI IDE usage experience minimal or short-lived throughput increases. In contrast, quality risks are persistent across settings, with static-analysis warnings and cognitive complexity rising by roughly 18% and 39%, respectively, indicating sustained agent-induced technical debt even when velocity advantages fade. These heterogeneous effects suggest diminishing returns to AI assistance and highlight the need for quality safeguards, provenance tracking, and selective deployment of autonomous agents. Our findings establish an empirical basis for understanding how agentic and IDE-based tools interact, and motivate research on balancing acceleration with maintainability in AI-integrated development workflows. The replication package for this study is publicly available at https://github.com/shyamagarwal13/agentic-coding-impact.
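For readers unfamiliar with the estimator, the core difference-in-differences idea can be illustrated with a minimal sketch. This is not the study's implementation: it is the canonical two-group, two-period case rather than the staggered design with matched controls named in the abstract, and the function names (`did_log_effect`, `as_percent_change`) and the numbers in `toy_rows` are hypothetical, chosen only to make the arithmetic visible.

```python
import math
from statistics import mean


def did_log_effect(rows):
    """Canonical 2x2 difference-in-differences on log outcomes.

    rows: iterable of (treated, post, outcome) with outcome > 0.
    Returns (treated post - treated pre) - (control post - control pre),
    computed on cell means of log(outcome).
    """
    cells = {(t, p): [] for t in (False, True) for p in (False, True)}
    for treated, post, outcome in rows:
        cells[(treated, post)].append(math.log(outcome))
    m = {key: mean(values) for key, values in cells.items()}
    treated_change = m[(True, True)] - m[(True, False)]
    control_change = m[(False, True)] - m[(False, False)]
    return treated_change - control_change


def as_percent_change(log_effect):
    """Convert a log-point effect into a percent change."""
    return (math.exp(log_effect) - 1.0) * 100.0


# Hypothetical toy data (NOT from the paper): monthly warning counts for one
# adopting repository and one matched control, before and after adoption.
toy_rows = [
    (True, False, 100.0),   # treated, pre-adoption
    (True, True, 150.0),    # treated, post-adoption
    (False, False, 100.0),  # control, pre
    (False, True, 120.0),   # control, post
]

effect_pct = as_percent_change(did_log_effect(toy_rows))
print(f"Estimated relative effect: {effect_pct:.1f}%")
```

In the toy data the treated repository grows by a factor of 1.5 and the control by 1.2, so the estimate is the ratio 1.5 / 1.2, a 25% relative increase attributed to adoption under the parallel-trends assumption. Working in logs is one common way to report effects as percentages; whether the study's 18% and 39% figures were obtained this way is not stated in the abstract.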

