Exploring and Unleashing the Power of Large Language Models in Automated Code Translation

Code translation tools (transpilers) are developed for automatic source-to-source translation. Although learning-based transpilers have shown impressive improvement over rule-based counterparts, owing to their task-specific pre-training on extensive monolingual corpora, their current performance remains unsatisfactory for practical deployment, and the associated training resources are prohibitively expensive. LLMs pre-trained on huge amounts of human-written code and text have shown remarkable performance in many code intelligence tasks due to their powerful generality, even without task-specific training. Thus, LLMs can potentially circumvent the above limitations, but they have not yet been exhaustively explored. This paper investigates diverse LLMs and learning-based transpilers on automated code translation tasks and finds that, although certain LLMs outperform current transpilers, they still have accuracy issues. Most of the failures are induced by a lack of comprehension of source programs, missing clear instructions on I/O types in translation, and ignoring discrepancies between source and target programs. Motivated by these findings, we further propose UniTrans, a Unified code Translation framework, applicable to various LLMs, for unleashing their power in this field. Specifically, UniTrans first crafts a series of test cases for target programs with the assistance of source programs. Next, it uses these auto-generated test cases to augment the code translation and then evaluates the correctness of the translated programs via execution. Afterward, UniTrans iteratively repairs incorrectly translated programs, prompted by the test case execution results. Extensive experiments are conducted on six translation settings between Python, Java, and C++. Three recent LLMs of diverse sizes are tested with UniTrans, and all achieve substantial improvements.
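The three phases described above (test case crafting, test-augmented translation with execution, and iterative repair) can be sketched as a small loop. This is only an illustrative sketch under stated assumptions, not the UniTrans implementation: the names `make_test_cases`, `run_tests`, `translate_with_repair`, and `stub_model` are hypothetical, the prompts are placeholders, expected outputs are assumed to come from running the source program, and the "target language" is Python so that the example can execute candidates directly. A real setup would compile and run Java or C++ and call an actual LLM.

```python
from typing import Callable, List, Tuple

TestCase = Tuple[tuple, object]   # (arguments, expected result)
Model = Callable[[str], str]      # hypothetical: prompt in, code out


def make_test_cases(source_fn: Callable, inputs: List[tuple]) -> List[TestCase]:
    """Phase 1 (assumed): get expected outputs by running the source program."""
    return [(args, source_fn(*args)) for args in inputs]


def run_tests(code: str, entry: str, tests: List[TestCase]) -> List[str]:
    """Execute a candidate translation and return failure messages."""
    namespace: dict = {}
    try:
        exec(code, namespace)          # load the candidate program
        fn = namespace[entry]
    except Exception as exc:
        return [f"load error: {exc!r}"]
    failures = []
    for args, expected in tests:
        try:
            got = fn(*args)
        except Exception as exc:
            failures.append(f"{entry}{args} raised {exc!r}")
            continue
        if got != expected:
            failures.append(f"{entry}{args} returned {got!r}, expected {expected!r}")
    return failures


def translate_with_repair(model: Model, source_code: str, entry: str,
                          tests: List[TestCase], max_rounds: int = 2):
    """Phases 2 and 3: translate with tests in the prompt, then repair on failure."""
    prompt = f"Translate this program.\n{source_code}\nTest cases:\n{tests}"
    candidate = model(prompt)
    failures = run_tests(candidate, entry, tests)
    rounds = 0
    while failures and rounds < max_rounds:
        # Feed execution results back to the model as repair hints.
        prompt = "Repair this program.\n" + candidate + "\nFailures:\n" + "\n".join(failures)
        candidate = model(prompt)
        failures = run_tests(candidate, entry, tests)
        rounds += 1
    return candidate, failures


# Toy demonstration with a stand-in model (no real LLM involved).
def source_add_abs(a, b):
    return abs(a) + abs(b)


SOURCE_CODE = "def source_add_abs(a, b):\n    return abs(a) + abs(b)\n"
BUGGY = "def add_abs(a, b):\n    return a + b\n"
FIXED = "def add_abs(a, b):\n    return abs(a) + abs(b)\n"


def stub_model(prompt: str) -> str:
    # First answer is deliberately wrong; the repair prompt yields a fix.
    return FIXED if prompt.startswith("Repair") else BUGGY


tests = make_test_cases(source_add_abs, [(1, 2), (-3, 4)])
final_code, remaining = translate_with_repair(stub_model, SOURCE_CODE, "add_abs", tests)
```

In this toy run the first candidate fails the second test case, and the failure message is what drives the repair round. Capping the loop with `max_rounds` is a design choice of the sketch so that a model that never converges cannot loop forever.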

