Vulnerability detectors based on deep learning (DL) models have proven effective in recent years. However, the opacity of their decision-making process makes their predictions difficult for security analysts to comprehend. To address this, various explanation approaches have been proposed that explain predictions by highlighting important features, and these approaches have been demonstrated to be effective in other domains such as computer vision and natural language processing. Unfortunately, an in-depth evaluation of the vulnerability-critical features, such as fine-grained vulnerability-related code lines, learned and understood by these explanation approaches is still lacking. In this study, we first evaluate the performance of ten explanation approaches for vulnerability detectors based on graph and sequence representations, measured by two quantitative metrics: fidelity and vulnerability line coverage rate. Our results show that fidelity alone is not sufficient for evaluating these approaches, as it fluctuates significantly across different datasets and detectors. We subsequently check the precision of the vulnerability-related code lines reported by the explanation approaches and find that all of them perform poorly on this task. This can be attributed to the inefficiency of explainers in selecting important features and to the presence of irrelevant artifacts learned by DL-based detectors.
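To make the evaluated quantities concrete, the following is a minimal sketch of how fidelity, line coverage rate, and line-level precision might be computed for a single sample. The abstract does not give formal definitions, so the formulas here are assumptions: fidelity is taken as the drop in the detector's predicted vulnerability score after the highlighted lines are removed, coverage as the share of ground-truth vulnerable lines that the explainer reports, and precision as the share of reported lines that are truly vulnerable. The function names and the toy detector are hypothetical and exist only for illustration.

```python
from typing import Callable, Sequence, Set


def fidelity(predict: Callable[[Sequence[str]], float],
             lines: Sequence[str],
             important: Set[int]) -> float:
    """Assumed definition: score drop after removing the highlighted lines."""
    masked = [line for i, line in enumerate(lines) if i not in important]
    return predict(lines) - predict(masked)


def line_coverage_rate(reported: Set[int], vulnerable: Set[int]) -> float:
    """Assumed definition: fraction of ground-truth vulnerable lines reported."""
    if not vulnerable:
        return 0.0
    return len(reported & vulnerable) / len(vulnerable)


def line_precision(reported: Set[int], vulnerable: Set[int]) -> float:
    """Assumed definition: fraction of reported lines that are truly vulnerable."""
    if not reported:
        return 0.0
    return len(reported & vulnerable) / len(reported)


def toy_detector(lines: Sequence[str]) -> float:
    """Hypothetical stand-in for a DL detector: flags any use of strcpy."""
    return 1.0 if any("strcpy" in line for line in lines) else 0.0


code = ["char buf[8];", "strcpy(buf, src);", "return 0;"]
ground_truth = {1}   # index of the truly vulnerable line
reported = {1, 2}    # lines highlighted by a hypothetical explainer

fid = fidelity(toy_detector, code, reported)
cov = line_coverage_rate(reported, ground_truth)
prec = line_precision(reported, ground_truth)
print(fid, cov, prec)
```

In this toy case the explainer reaches full fidelity and full coverage while only half of its reported lines are actually vulnerable, which illustrates why a high fidelity score by itself need not imply precise line-level explanations.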