Atomizer: An LLM-based Collaborative Multi-Agent Framework for Intent-Driven Commit Untangling

Composite commits, which entangle multiple unrelated concerns, are prevalent in software development and significantly hinder program comprehension and maintenance. Existing automated untangling methods, particularly state-of-the-art graph clustering-based approaches, are fundamentally limited by two issues: (1) they over-rely on structural information, failing to grasp the crucial semantic intent behind changes, and (2) they operate as "single-pass" algorithms, lacking a mechanism for the critical reflection and refinement inherent in human review processes. To overcome these challenges, we introduce Atomizer, a novel collaborative multi-agent framework for composite commit untangling. To address the semantic deficit, Atomizer employs an Intent-Oriented Chain-of-Thought (IO-CoT) strategy, which prompts large language models (LLMs) to infer the intent of each code change according to both the structure and the semantic information of code. To overcome the limitations of "single-pass" grouping, we employ two agents to establish a grouper-reviewer collaborative refinement loop, which mirrors human review practices by iteratively refining groupings until all changes in a cluster share the same underlying semantic intent. Extensive experiments on two benchmark C# and Java datasets demonstrate that Atomizer significantly outperforms several representative baselines. On average, it surpasses the state-of-the-art graph-based methods by over 6.0% on the C# dataset and 5.5% on the Java dataset. This superiority is particularly pronounced on complex commits, where Atomizer's performance advantage widens to over 16%.
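
The control flow of the grouper-reviewer loop can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `Change`, `untangle`, `max_rounds`, and the three `toy_*` functions are hypothetical, and the toy functions are simple rule-based stand-ins for the LLM-backed intent inference, grouper agent, and reviewer agent, whose prompts are not given in this abstract. The sketch only shows one plausible shape for "infer intents, group, review, and regroup until every cluster shares one intent".

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Grouping = List[List[str]]  # each inner list is one cluster of change ids


@dataclass(frozen=True)
class Change:
    """One code change inside a composite commit (hypothetical type)."""
    change_id: str
    diff: str


def untangle(
    changes: List[Change],
    infer_intent: Callable[[Change], str],
    group: Callable[[Dict[str, str], List[str]], Grouping],
    review: Callable[[Grouping, Dict[str, str]], List[str]],
    max_rounds: int = 3,  # assumed safety bound, not taken from the paper
) -> Grouping:
    """Iteratively refine a grouping until the reviewer raises no objection."""
    # Step 1: infer an intent label per change (stand-in for IO-CoT prompting).
    intents = {c.change_id: infer_intent(c) for c in changes}
    # Step 2: initial grouping, produced without any reviewer feedback.
    grouping = group(intents, [])
    # Step 3: review and regroup until accepted or the round budget runs out.
    for _ in range(max_rounds):
        feedback = review(grouping, intents)
        if not feedback:
            break
        grouping = group(intents, feedback)
    return grouping


# --- Toy stand-ins for the LLM agents (assumptions for illustration) ---

def toy_infer_intent(change: Change) -> str:
    # Keyword rule in place of an LLM call.
    return "bugfix" if "null check" in change.diff else "rename"


def toy_group(intents: Dict[str, str], feedback: List[str]) -> Grouping:
    if not feedback:
        # Deliberately naive first pass: everything in a single cluster.
        return [sorted(intents)]
    # After feedback, split the changes by their inferred intent.
    by_intent: Dict[str, List[str]] = {}
    for change_id, intent in intents.items():
        by_intent.setdefault(intent, []).append(change_id)
    return [sorted(ids) for _, ids in sorted(by_intent.items())]


def toy_review(grouping: Grouping, intents: Dict[str, str]) -> List[str]:
    # Flag every cluster whose members do not all share one intent.
    return [
        f"cluster {i} mixes intents"
        for i, cluster in enumerate(grouping)
        if len({intents[cid] for cid in cluster}) > 1
    ]


changes = [
    Change("c1", "add null check before parse"),
    Change("c2", "rename foo to bar"),
    Change("c3", "add null check in loader"),
]
result = untangle(changes, toy_infer_intent, toy_group, toy_review)
print(result)
```

In this toy run the first pass lumps all three changes together, the reviewer flags the mixed cluster, and the second pass separates the two null-check changes from the rename. Passing the agents in as plain callables keeps the loop independent of any particular LLM client.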
