Autonomous coding agents increasingly contribute to software development by submitting pull requests on GitHub, yet little is known about how these contributions integrate into human-driven review workflows. We present a large-scale empirical study of agent-authored pull requests using the public AIDev dataset, examining integration outcomes, resolution speed, and review-time collaboration signals. Using logistic regression with repository-clustered standard errors, we find that reviewer engagement has the strongest correlation with successful integration, whereas larger change sizes and coordination-disrupting actions, such as force pushes, are associated with a lower likelihood of merging. In contrast, iteration intensity alone provides limited explanatory power once collaboration signals are considered. A qualitative analysis further shows that successful integration occurs when agents engage in actionable review loops that converge toward reviewer expectations. Overall, our results highlight that the effective integration of agent-authored pull requests depends not only on code quality but also on alignment with established review and coordination practices.
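To make the modeling setup concrete, the following is a minimal sketch (not the authors' code) of a logistic regression with repository-clustered standard errors in Python using statsmodels. All column names (merged, reviewer_comments, loc_changed, force_pushes, repo_id) and the synthetic data are illustrative assumptions, not drawn from AIDev:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic pull-request table; columns are hypothetical stand-ins for
# the study's predictors (engagement, change size, force pushes).
rng = np.random.default_rng(0)
n = 400
prs = pd.DataFrame({
    "reviewer_comments": rng.poisson(3, n),        # review-engagement signal
    "loc_changed": rng.exponential(300, n),        # change size
    "force_pushes": rng.poisson(0.5, n),           # coordination-disrupting actions
    "repo_id": rng.integers(0, 40, n),             # repository, used for clustering
})

# Simulate merge outcomes loosely following the reported directions:
# engagement helps; larger changes and force pushes hurt.
logit_p = (0.5 + 0.4 * prs["reviewer_comments"]
           - 0.002 * prs["loc_changed"]
           - 0.6 * prs["force_pushes"])
prs["merged"] = rng.binomial(1, 1 / (1 + np.exp(-logit_p)))

# Logistic regression with standard errors clustered by repository,
# so inference accounts for correlated PRs within the same repo.
model = smf.logit(
    "merged ~ reviewer_comments + loc_changed + force_pushes", data=prs
).fit(cov_type="cluster", cov_kwds={"groups": prs["repo_id"]})
print(model.summary())
```

Clustering by repository inflates the standard errors relative to the naive i.i.d. assumption, which is why it is the appropriate choice when multiple pull requests share a repository's review norms.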