Recently, pretrained language models have shown state-of-the-art performance on the vulnerability detection task. These models are pretrained on a large corpus of source code and then fine-tuned on a smaller supervised vulnerability dataset. Given the different training objectives and the performance of these models, it is worth asking whether the models have learned the code semantics relevant to vulnerability detection, namely bug semantics, and if so, how alignment to bug semantics relates to model performance. In this paper, we analyze the models using three distinct methods: interpretability tools, attention analysis, and interaction matrix analysis. We compare the models' influential feature sets with the bug semantic features that define the causes of bugs, including buggy paths and Potentially Vulnerable Statements (PVS). We find that (1) better-performing models also aligned better with PVS, (2) the models nonetheless failed to align strongly with PVS, and (3) the models failed to align at all with buggy paths. Based on our analysis, we developed two annotation methods that highlight the bug semantics inside the model's inputs. We evaluated our approach on four distinct transformer models and four vulnerability datasets and found that our annotations improved the models' performance in the majority of settings (11 out of 16), with an improvement of up to 9.57 points in F1 score compared to conventional fine-tuning. We further found that, with our annotations, the models aligned up to 232% better with potentially vulnerable statements. Our findings indicate that providing the model with information about bug semantics is helpful and that the model can attend to this information; they also motivate future work on learning more complex path-based bug semantics. Our code and data are available at https://figshare.com/s/4a16a528d6874aad51a0.
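To make the notion of alignment concrete, the following minimal sketch shows one plausible way to compare a model's influential features with PVS. It is an illustration under stated assumptions, not the measure used in this work: the function names `alignment_iou` and `relative_improvement`, the choice of intersection-over-union over the top-k attributed lines, and the example scores are all hypothetical.

```python
def alignment_iou(attribution_scores, pvs_lines, k):
    """Intersection-over-union between the k most influential lines and the PVS lines.

    attribution_scores: one importance score per source line (hypothetical input,
        e.g. from an interpretability tool).
    pvs_lines: indices of lines labeled as potentially vulnerable statements.
    k: number of top-scoring lines treated as the model's influential set.
    """
    # Rank line indices from most to least influential.
    ranked = sorted(
        range(len(attribution_scores)),
        key=lambda i: attribution_scores[i],
        reverse=True,
    )
    top_k = set(ranked[:k])
    pvs = set(pvs_lines)
    union = top_k | pvs
    # An empty union means there is nothing to align; report zero.
    return len(top_k & pvs) / len(union) if union else 0.0


def relative_improvement(before, after):
    """Relative change in percent, one possible reading of 'X% better' (assumption)."""
    return (after - before) / before * 100.0


# Hypothetical example: five lines, two of which are PVS.
scores = [0.05, 0.40, 0.10, 0.30, 0.15]
iou = alignment_iou(scores, pvs_lines=[1, 4], k=2)
```

In this example the two most influential lines are 1 and 3, only one of which is a PVS, so the overlap is one line out of a union of three. A path-based variant would need to compare ordered sequences of statements rather than sets, which this sketch does not attempt.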