Code differencing is a fundamental technique in software engineering practice and research. Although researchers have proposed text-based differencing techniques over the past decade that identify line changes, existing methods are notably limited in identifying edit actions (EAs) that operate on text blocks spanning multiple lines. Such EAs are common in practice, for example moving a code block for conditional branching or duplicating a method definition block for overloading. Existing tools represent these block-level operations as discrete sequences of line-level EAs, which forces developers to correlate them manually and substantially impedes change comprehension. To address this issue, we propose BDiff, a text-based differencing algorithm that identifies two types of block-level EAs and five types of line-level EAs. Building on traditional differencing algorithms, we first construct a candidate set containing all possible line mappings and block mappings. Using the Kuhn-Munkres algorithm, we then compute the optimal mapping set, which minimizes the size of the edit script (ES) while closely aligning with the developer's original intent. To validate the effectiveness of BDiff, we selected five state-of-the-art tools, including large language models (LLMs), as baselines and combined qualitative and quantitative evaluation to compare their performance in terms of ES size, result quality, and running time. Experimental results show that BDiff produces higher-quality differencing results than the baseline tools while maintaining competitive runtime performance. Our experiments also show that LLMs are unreliable in code differencing tasks with respect to result quality and infeasible with respect to runtime efficiency. We have also implemented a web-based visual differencing tool.
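To make the block-level limitation concrete, the following minimal Python sketch shows how a conventional line-based matcher reports a moved block as an unrelated deletion and insertion, and how a naive post-processing step could pair the two. This is not BDiff's algorithm: the function names `line_opcodes` and `find_block_moves`, the `min_len` threshold, and the exact-match pairing heuristic are assumptions made purely for illustration, and the sketch does not build a full candidate set or apply Kuhn-Munkres.

```python
from difflib import SequenceMatcher

# Hypothetical example input: the two validation lines move below the file I/O.
OLD = [
    "def load(path):",
    "    validate(path)",
    "    normalize(path)",
    "    handle = open(path)",
    "    data = handle.read()",
    "    handle.close()",
    "    return data",
]
NEW = [
    "def load(path):",
    "    handle = open(path)",
    "    data = handle.read()",
    "    handle.close()",
    "    validate(path)",
    "    normalize(path)",
    "    return data",
]


def line_opcodes(old, new):
    """Return the non-equal edit actions a line-based matcher reports."""
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    return [op for op in matcher.get_opcodes() if op[0] != "equal"]


def find_block_moves(old, new, min_len=2):
    """Pair each deleted run with an identical inserted run.

    Illustrative heuristic only (an assumption, not the paper's method):
    a run counts as a moved block when its lines match exactly and it
    spans at least `min_len` lines.
    Returns a list of (old_start, new_start, length) tuples.
    """
    ops = line_opcodes(old, new)
    deleted = [(i1, i2) for tag, i1, i2, _, _ in ops if tag == "delete"]
    inserted = [(j1, j2) for tag, _, _, j1, j2 in ops if tag == "insert"]

    moves = []
    used = set()  # inserted runs that are already paired
    for i1, i2 in deleted:
        if i2 - i1 < min_len:
            continue
        for j1, j2 in inserted:
            if (j1, j2) not in used and old[i1:i2] == new[j1:j2]:
                moves.append((i1, j1, i2 - i1))
                used.add((j1, j2))
                break
    return moves


if __name__ == "__main__":
    # Line-level view: two separate edit actions the reader must correlate.
    for op in line_opcodes(OLD, NEW):
        print(op)
    # Block-level view: a single move of a two-line block.
    print(find_block_moves(OLD, NEW))
```

The exact-match pairing keeps the sketch short. It would miss blocks that are edited while being moved, as well as duplicated blocks, which is one reason a global optimization over candidate mappings, as the abstract describes, is a more robust design.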