Code translation transforms source code from one programming language (PL) to another. Validating the functional equivalence of a translation and repairing it, if necessary, are critical steps in code translation. Existing automated validation and repair approaches struggle to generalize to many PLs due to high engineering overhead, and they rely on existing, often inadequate test suites, which results in false claims of equivalence and ineffective translation repair. To bridge this gap, we develop MatchFixAgent, a large language model (LLM)-based, PL-agnostic framework for equivalence validation and repair of translations. MatchFixAgent features a multi-agent architecture that divides equivalence validation into several sub-tasks to ensure thorough and consistent semantic analysis of the translation. We compare MatchFixAgent's validation and repair results with those of four repository-level code translation techniques. Our results demonstrate that MatchFixAgent produces (in)equivalence verdicts for 99.2% of translation pairs and reaches the same equivalence verdict as prior work on 72.8% of them. When MatchFixAgent's result disagrees with prior work, we find that MatchFixAgent's result is the correct one 60.7% of the time. In addition, we show that MatchFixAgent can repair 50.6% of inequivalent translations, compared to 18.5% for prior work.
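To illustrate how an inadequate test suite can produce a false claim of equivalence, consider the minimal sketch below. It is not part of MatchFixAgent; the function names and test inputs are hypothetical and chosen only for illustration. It models a source function that uses truncating integer division (as in Java or C) and a naive Python translation that uses floor division, which differ only when the operands have opposite signs.

```python
def source_div(a: int, b: int) -> int:
    """Hypothetical model of the source: integer division truncating toward zero."""
    q = abs(a) // abs(b)
    # Same-sign operands give a non-negative quotient; otherwise negate it.
    return q if (a < 0) == (b < 0) else -q


def translated_div(a: int, b: int) -> int:
    """Hypothetical naive translation: Python floor division rounds toward -infinity."""
    return a // b


def agree_on(inputs) -> bool:
    """Return True if source and translation produce equal results on every input."""
    return all(source_div(a, b) == translated_div(a, b) for a, b in inputs)


# A weak suite with only non-negative operands cannot expose the difference.
weak_suite = [(7, 2), (9, 3), (0, 5)]
# Adding one mixed-sign input reveals the inequivalence.
stronger_suite = weak_suite + [(-7, 2)]

print(agree_on(weak_suite))      # True: the suite suggests equivalence
print(agree_on(stronger_suite))  # False: -7 / 2 is -3 when truncating, -4 when flooring
```

A validator that relies only on the weak suite would report the pair as equivalent, which is the kind of false claim that motivates semantic analysis beyond existing tests.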