Method names are crucial to program comprehension and maintenance. Recently, many approaches have been proposed to automatically recommend method names and detect inconsistent names. Although promising, their results remain sub-optimal because of three drawbacks: 1) These models are mostly trained from scratch and learn two different objectives simultaneously; the misalignment between the two objectives negatively affects training efficiency and model performance. 2) The enclosing class context is not fully exploited, making it difficult to learn the abstract function of the method. 3) Current method name consistency checking methods follow a generate-then-compare process, which limits accuracy because they rely heavily on the quality of the generated names and have difficulty measuring semantic consistency. In this paper, we propose an approach named AUMENA to AUtomate MEthod NAming tasks with context-aware prompt-tuning. Unlike existing deep learning based approaches, our model first learns the contextualized representation (i.e., class attributes) of programming language (PL) and natural language (NL) through the pre-training model, then fully exploits the capacity and knowledge of the large language model with prompt-tuning to precisely detect inconsistent method names and recommend more accurate names. To better identify semantically consistent names, we model the method name consistency checking task as a two-class classification problem, avoiding the limitation of previous similarity-based consistency checking approaches. The experimental results show that AUMENA scores 68.6%, 72.0%, 73.6%, and 84.7% on four method name recommendation datasets, surpassing the state-of-the-art baseline by 8.5%, 18.4%, 11.0%, and 12.0%, respectively. Our approach also achieves 80.8% accuracy on method name consistency checking, a 5.5% improvement. All data and trained models are publicly available.
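To make the two-class formulation concrete, the following is a minimal sketch of how a context-aware prompt for consistency checking might be assembled. It is not the AUMENA implementation: the template wording, the `<mask>` token, the `VERBALIZER` mapping, and all names (`MethodSample`, `build_consistency_prompt`, `label_from_prediction`) are assumptions introduced purely for illustration. The idea it shows is that the model is asked to fill one masked slot with a label word, rather than to generate a name that is then compared against the existing one.

```python
import re
from dataclasses import dataclass
from typing import List

MASK = "<mask>"  # hypothetical mask token; the real token depends on the model

# Hypothetical verbalizer: label words the masked slot may take, mapped to classes.
VERBALIZER = {"consistent": 1, "inconsistent": 0}


@dataclass
class MethodSample:
    name: str                    # method name under inspection
    body: str                    # method body (source text)
    class_name: str              # enclosing class name
    class_attributes: List[str]  # attribute names of the enclosing class


def split_subtokens(name: str) -> List[str]:
    """Split a camelCase or snake_case identifier into lowercase subtokens."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return [p.lower() for p in parts]


def build_consistency_prompt(sample: MethodSample) -> str:
    """Assemble an illustrative cloze prompt with enclosing-class context."""
    attributes = ", ".join(sample.class_attributes)
    name_words = " ".join(split_subtokens(sample.name))
    return (
        f"class {sample.class_name} with attributes: {attributes}. "
        f"method body: {sample.body} "
        f"The name '{name_words}' is {MASK} with the method body."
    )


def label_from_prediction(label_word: str) -> int:
    """Map the word predicted for the masked slot to a binary class."""
    return VERBALIZER[label_word]


sample = MethodSample(
    name="getUserName",
    body="return this.userName;",
    class_name="Account",
    class_attributes=["userName", "email"],
)
print(build_consistency_prompt(sample))
```

In such a setup, a prompt-tuned model would score only the label words at the masked position, so the decision does not depend on the quality of a separately generated name.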