Evaluating and Optimizing the Effectiveness of Neural Machine Translation in Supporting Code Retrieval Models: A Study on the CAT Benchmark

From arXiv. Accepted as a full paper in the Proceedings of the 32nd ACM International Conference on Information and Knowledge Management (CIKM), Birmingham, UK, October 2023.

Neural Machine Translation (NMT) is widely applied in software engineering tasks. The effectiveness of NMT for code retrieval relies on its ability to learn a mapping from the sequence of tokens in the source language to the sequence of tokens in the target language. While NMT performs well in pseudocode-to-code translation, it may struggle to learn to translate natural language queries into source code on newly curated real-world code documentation/implementation datasets. In this work, we analyze the performance of NMT in natural language-to-code translation on the newly curated CAT benchmark, which includes optimized versions of three Java datasets (TLCodeSum, CodeSearchNet, and Funcom) and one Python dataset (PCSD). Our evaluation shows that NMT has low accuracy on this task, as measured by the CrystalBLEU and Meteor metrics. To reduce the burden on NMT of learning a complex representation of source code, we propose ASTTrans Representation, a tailored representation of an Abstract Syntax Tree (AST) that uses a subset of non-terminal nodes. We show that the classical NMT approach performs significantly better when learning ASTTrans Representation than when learning code tokens, with up to 36% improvement in Meteor score. Moreover, we leverage ASTTrans Representation to build combined code search processes on top of state-of-the-art code search processes that use GraphCodeBERT and UniXcoder. Our NMT models that learn ASTTrans Representation can boost the Mean Reciprocal Rank of these state-of-the-art code search processes by up to 3.08% and improve the results of 23.08% of queries over the CAT benchmark.
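For readers unfamiliar with the retrieval measures named above, the following is a minimal sketch of how Mean Reciprocal Rank and the share of improved queries are commonly computed. It is not the authors' evaluation code: the function names and the toy rank lists are hypothetical, and the numbers are invented for illustration and do not reflect results from the paper.

```python
from typing import Sequence


def mean_reciprocal_rank(ranks: Sequence[int]) -> float:
    """Average of 1/rank, where ranks[i] is the 1-based position of the
    correct code snippet in the result list for query i."""
    if not ranks:
        raise ValueError("ranks must be non-empty")
    return sum(1.0 / r for r in ranks) / len(ranks)


def fraction_improved(baseline: Sequence[int], combined: Sequence[int]) -> float:
    """Share of queries whose correct snippet is ranked strictly higher
    (smaller rank number) by the combined search than by the baseline."""
    if len(baseline) != len(combined):
        raise ValueError("rank lists must have the same length")
    improved = sum(1 for b, c in zip(baseline, combined) if c < b)
    return improved / len(baseline)


# Toy data (assumption): ranks of the correct snippet for four queries.
baseline_ranks = [1, 4, 2, 10]   # hypothetical baseline code search
combined_ranks = [1, 2, 2, 5]    # hypothetical combined code search

print(mean_reciprocal_rank(baseline_ranks))
print(mean_reciprocal_rank(combined_ranks))
print(fraction_improved(baseline_ranks, combined_ranks))
```

In this sketch a query counts as improved only when its rank strictly decreases; ties are left out of the count, which is one plausible reading of "improve the results of a query".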
