In this paper, we present a method for Java-to-Python (J2P) and Python-to-Java (P2J) back-to-back code translation based on large language models (LLMs), together with an associated tool called CoTran. Our method combines the attention mechanism of LLMs with compilation and symbolic execution-based test generation, which are used to test equivalence between the input and output programs. More precisely, we modify the typical LLM training loop to incorporate compiler and symbolic execution losses. In extensive experiments comparing CoTran with 12 other transpilers and LLM-based translation tools on a benchmark of more than 57,000 equivalent Java-Python pairs, we show that CoTran outperforms them on relevant metrics such as compilation accuracy and runtime equivalence accuracy. For example, our tool achieves 97.43% compilation accuracy and 49.66% runtime equivalence accuracy for J2P translation, whereas the nearest competing tool achieves only 92.84% and 40.95%, respectively.
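To make the two reported metrics concrete, the following is a minimal sketch of how compilation accuracy and runtime equivalence accuracy could be computed for J2P outputs. It is an illustration under stated assumptions, not CoTran's evaluation harness: the helper names (`compiles`, `run_program`, `accuracy`) are hypothetical, hand-written input/output pairs stand in for the symbolic execution-generated tests, and "compiles" is approximated by Python's built-in `compile()`.

```python
import contextlib
import io
import sys
from typing import List, Optional, Tuple

# Hypothetical helpers for illustration only; not part of CoTran.


def compiles(source: str) -> bool:
    """Return True if `source` byte-compiles as Python."""
    try:
        compile(source, "<translation>", "exec")
        return True
    except (SyntaxError, ValueError):
        return False


def run_program(source: str, stdin_text: str) -> Optional[str]:
    """Run `source` with `stdin_text` on stdin; return its stdout, or None on error.

    Assumption: the program is trusted. Real translations should be sandboxed.
    """
    out = io.StringIO()
    old_stdin = sys.stdin
    sys.stdin = io.StringIO(stdin_text)
    try:
        with contextlib.redirect_stdout(out):
            exec(compile(source, "<translation>", "exec"), {"__name__": "__main__"})
    except Exception:
        return None
    finally:
        sys.stdin = old_stdin
    return out.getvalue()


def accuracy(
    translations: List[str],
    tests: List[List[Tuple[str, str]]],
) -> Tuple[float, float]:
    """Return (compilation accuracy, runtime equivalence accuracy).

    tests[i] is a list of (stdin, expected_stdout) pairs for translations[i].
    A translation counts as runtime-equivalent only if it passes all its tests.
    """
    n = len(translations)
    compiled = sum(compiles(src) for src in translations)
    equivalent = sum(
        compiles(src) and all(run_program(src, i) == o for i, o in cases)
        for src, cases in zip(translations, tests)
    )
    return compiled / n, equivalent / n


# Three candidate translations of a "double the input" program.
translations = [
    "print(int(input()) * 2)",  # correct
    "print(int(input()) + 2)",  # compiles, wrong behaviour
    "print(",                   # does not compile
]
tests = [[("3\n", "6\n")]] * 3
comp_acc, eq_acc = accuracy(translations, tests)
```

In this toy run, two of the three candidates compile and only one matches the expected output, which mirrors the gap between the two metrics in the reported results: a translation can compile without being behaviourally equivalent to its source.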