Repository-Level Compositional Code Translation and Validation

Code translation transforms programs from one programming language (PL) to another. Several rule-based transpilers have been designed to automate code translation between specific pairs of PLs. However, their rules can become obsolete as the PLs evolve, and they do not generalize to other PLs. Recent studies have explored automating code translation with Large Language Models (LLMs). One key observation is that such techniques may work well on crafted benchmarks but fail to generalize to the scale and complexity of real-world projects, which involve dependencies, custom types, PL-specific features, and so on. We propose AlphaTrans, a neuro-symbolic approach to automating repository-level code translation. AlphaTrans translates both source and test code, and employs multiple levels of validation to ensure that the translation preserves the functionality of the source program. To break the problem down for LLMs, AlphaTrans uses program analysis to decompose the program into fragments and translates them in reverse call order, so that a callee is translated before the fragments that call it. We used AlphaTrans to translate ten real-world open-source projects comprising 836 classes, 8575 methods, and 2719 tests. AlphaTrans translated the entire repository of each project, 6899 source code fragments in total. Of the translated code fragments, 99.1% are syntactically correct, and AlphaTrans validates the runtime behavior and functional correctness of 25.8%. On average, the integrated translation and validation takes 36 hours per project, showing its scalability in practice. For syntactically or semantically incorrect translations, AlphaTrans generates a report that includes the existing translation, stack trace, test errors, or assertion failures. We provided these artifacts to two developers, who used them to fix the translation bugs in four projects. They fixed the issues in 20.1 hours on average, after which all tests passed.
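
To make the idea of translating fragments in reverse call order concrete, the following is a minimal sketch, not AlphaTrans's actual implementation. It assumes the call graph is already available as a dictionary mapping each fragment to the fragments it calls; the function name `reverse_call_order` and the fragment names are hypothetical. It also assumes the graph is acyclic: with recursion or mutual recursion, `graphlib` raises `CycleError`, and the cycle would need separate handling.

```python
from graphlib import TopologicalSorter


def reverse_call_order(call_graph):
    """Order fragments so that every callee comes before its callers.

    call_graph maps a fragment name to the set of fragments it calls
    (a hypothetical representation chosen for this sketch).
    """
    # TopologicalSorter expects node -> predecessors. Treating callees as
    # predecessors of their caller yields an order with callees first.
    return list(TopologicalSorter(call_graph).static_order())


# Hypothetical call graph: main calls parse and render, both call tokenize.
call_graph = {
    "main": {"parse", "render"},
    "parse": {"tokenize"},
    "render": {"tokenize"},
    "tokenize": set(),
}

order = reverse_call_order(call_graph)
for fragment in order:
    # In a real pipeline, each fragment would be translated at this point,
    # with the translations of its callees already available as context.
    print(fragment)
```

In this ordering `tokenize` is always first and `main` is always last; the relative order of `parse` and `render` is not fixed, because neither depends on the other.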
