Large language model (LLM)-based coding agents increasingly act as autonomous contributors that generate and merge pull requests, yet their real-world effects on software projects remain unclear, especially relative to widely adopted IDE-based AI assistants. We present a longitudinal causal study of agent adoption in open-source repositories using staggered difference-in-differences with matched controls. Using the AIDev dataset, we define adoption as a repository's first agent-generated pull request and analyze monthly repository-level outcomes spanning development velocity (commits, lines added) and software quality (static-analysis warnings, cognitive complexity, duplication, and comment density). Results show large, front-loaded velocity gains only when agents are the first observable AI tool in a project; repositories with prior AI IDE usage see minimal or short-lived throughput benefits. In contrast, quality risks persist across settings, with static-analysis warnings and cognitive complexity rising by roughly 18% and 35%, respectively, indicating sustained agent-induced complexity debt even when velocity advantages fade. These heterogeneous effects suggest diminishing returns to AI assistance and highlight the need for quality safeguards, provenance tracking, and selective deployment of autonomous agents. Our findings establish an empirical basis for understanding how agentic and IDE-based tools interact, and motivate research on balancing acceleration with maintainability in AI-integrated development workflows.
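The identification strategy described above can be illustrated with a toy simulation. This is a minimal sketch of the canonical 2x2 difference-in-differences contrast on synthetic panel data, not the paper's staggered estimator with matched controls; all numbers (repository count, adoption month, effect size) are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

n_repos, n_months = 200, 24
treat_month = 12     # hypothetical agent-adoption month for treated repos
true_effect = 3.0    # assumed lift in monthly commits after adoption

treated = np.arange(n_repos) < n_repos // 2   # first half adopt an agent
months = np.arange(n_months)

# Outcome: repo-specific baseline + common time trend + treatment effect
# applied only to treated repos in post-adoption months, plus noise.
baseline = rng.normal(10.0, 2.0, size=(n_repos, 1))
trend = 0.1 * months                           # differenced out by the control group
post = months >= treat_month
y = baseline + trend + true_effect * (treated[:, None] & post)
y += rng.normal(0.0, 0.5, size=(n_repos, n_months))

def did(y, treated, post):
    """(treated post - treated pre) minus (control post - control pre)."""
    pre = ~post
    return ((y[treated][:, post].mean() - y[treated][:, pre].mean())
            - (y[~treated][:, post].mean() - y[~treated][:, pre].mean()))

est = did(y, treated, post)
print(f"DiD estimate: {est:.2f} (true effect: {true_effect})")
```

Because the common trend appears identically in both groups, subtracting the control group's pre/post change removes it, recovering an estimate close to the simulated effect. The paper's staggered design generalizes this contrast to repositories adopting at different calendar months.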