On the Reliability and Explainability of Automated Code Generation Approaches

Automatic code generation, the task of generating new code snippets from existing code or comments, has long been of interest. Numerous code generation models have been proposed and validated on different benchmark datasets. However, little is known about whether this objective has actually been achieved, or why code generation models are able to transform code sequences effectively. In other words, can we fully trust these automated code generation models? There is therefore a pressing need to understand the inner logic of code generation models and to investigate their replicability, reliability, and explainability. To bridge these research gaps, we conduct a thorough empirical study of five code generation models on four representative code generation datasets to assess the limits and capabilities of automatic code generation approaches. We further employ advanced explainable AI approaches to highlight the tokens that contribute significantly to code generation. Our experiments show that we can successfully replicate state-of-the-art code generation approaches. We find that these approaches suffer from severe data duplication and input insensitivity, subtle issues with significant implications. Our explainability analysis reveals that, across various experimental scenarios, code generation models can recognize code grammar and structural information but cannot capture the key tokens that need to be updated. Our results yield several lessons and guidelines for future work in this area.

