Coding agents are increasingly capable of completing end-to-end software engineering workflows that previously required a human developer, including raising pull requests (PRs) to propose their changes. However, we still know little about how these agents use libraries when generating code, a core part of real-world software development. To fill this gap, we study 26,760 agent-authored PRs from the AIDev dataset and examine three questions: how often do agents import libraries, how often do they introduce new dependencies (and with what versioning), and which specific libraries do they choose? We find that agents often import libraries (29.5% of PRs) but rarely add new dependencies (1.3% of PRs). When they do add dependencies, they follow strong versioning practices (75.0% specify a version), an improvement over direct LLM usage, where versions are rarely mentioned. Overall, agents draw from a surprisingly diverse set of external libraries, in contrast with the limited "library preferences" seen in prior non-agentic LLM studies. Our findings offer an early empirical view of how AI coding agents interact with today's software ecosystems.
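To make the two headline measurements concrete, the following is a minimal Python sketch of how one might detect added imports in a PR diff and check whether a declared dependency specifies a version. It is an illustration only, not the study's pipeline: the function names (`imported_modules`, `specifies_version`), the restriction to Python-style imports and `requirements.txt`-style lines, and the rule that any comparison operator counts as "specifying a version" are all assumptions made for this example.

```python
import re

# Hypothetical helpers for illustration; the study's actual tooling is not
# described in this abstract.

# An added diff line ("+...") that starts a Python import statement.
ADDED_IMPORT = re.compile(r"^\+\s*(?:import|from)\s+([A-Za-z_][A-Za-z0-9_]*)")

# Comparison operators used in requirements.txt-style version specifiers.
VERSION_OPERATOR = re.compile(r"(===|==|~=|!=|<=|>=|<|>)")


def imported_modules(diff_text):
    """Return top-level module names from import lines added in a unified diff."""
    modules = set()
    for line in diff_text.splitlines():
        if line.startswith("+++"):  # file header, not an added line
            continue
        match = ADDED_IMPORT.match(line)
        if match:
            modules.add(match.group(1))
    return modules


def specifies_version(requirement):
    """Return True if a requirements.txt-style line carries a version constraint.

    Assumption: any comparison operator (==, >=, ~=, ...) counts as a version.
    """
    # Drop trailing comments and environment markers before checking.
    spec = requirement.split("#", 1)[0].split(";", 1)[0].strip()
    return bool(spec) and VERSION_OPERATOR.search(spec) is not None
```

For example, `imported_modules` applied to a diff that adds `import requests` and `from numpy.linalg import norm` yields the top-level names `requests` and `numpy`, while `specifies_version("requests")` is false and `specifies_version("requests==2.31.0")` is true. A real analysis would need per-language import parsing and per-ecosystem manifest formats, which this sketch deliberately leaves out.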