Large language models (LLMs) are increasingly being integrated into software development processes. The ability of autonomous AI agents to generate code and submit pull requests with minimal human intervention is poised to become standard practice. However, little is known about the practical usefulness of these pull requests and the extent to which their contributions are accepted in real-world projects. In this paper, we empirically study 567 GitHub pull requests (PRs) generated using Claude Code, an agentic coding tool, across 157 diverse open-source projects. Our analysis reveals that developers tend to rely on agents for tasks such as refactoring, documentation, and testing. The results indicate that 83.8% of these agent-assisted PRs are eventually accepted and merged by project maintainers, with 54.9% of the merged PRs integrated without further modification. The remaining 45.1% require additional changes and benefit from human revisions, especially for bug fixes, documentation, and adherence to project-specific standards. These findings suggest that while agent-assisted PRs are largely acceptable, they still benefit from human oversight and refinement.