MAGIS: LLM-Based Multi-Agent Framework for GitHub Issue Resolution

In software evolution, resolving emergent issues within GitHub repositories is a complex challenge that involves not only incorporating new code but also maintaining existing functionality. Large Language Models (LLMs) have shown promise in code generation and understanding but struggle with code changes, particularly at the repository level. To overcome these challenges, we empirically study why LLMs mostly fail to resolve GitHub issues and analyze several contributing factors. Motivated by the empirical findings, we propose a novel LLM-based Multi-Agent framework for GitHub Issue reSolution, MAGIS, consisting of four kinds of agents customized for software evolution: Manager, Repository Custodian, Developer, and Quality Assurance Engineer agents. The framework leverages collaboration among these agents in the planning and coding process to unlock the potential of LLMs to resolve GitHub issues. In experiments, we use the SWE-bench benchmark to compare MAGIS with popular LLMs, including GPT-3.5, GPT-4, and Claude-2. MAGIS resolves 13.94% of GitHub issues, significantly outperforming the baselines. Specifically, MAGIS achieves an eight-fold increase in the resolved ratio over the direct application of GPT-4, the base LLM of our method. We also analyze factors that improve GitHub issue resolution rates, such as line location and task allocation.

