In recent years, deep learning (DL)-based methods have been widely used for code vulnerability detection. These methods typically extract structural information from source code, e.g., a code structure graph, and use neural networks such as Graph Neural Networks (GNNs) to learn graph representations. However, they do not account for the heterogeneous relations in the code structure graph, i.e., the fact that different types of edges connect different types of nodes, which may hinder graph representation learning. In addition, they are limited in capturing long-range dependencies because code structure graphs are deep. In this paper, we propose MAGNET, a Meta-path based Attentional Graph learning model for code vulNErability deTection. MAGNET constructs a multi-granularity meta-path graph for each code snippet, in which heterogeneous relations are denoted as meta-paths that represent the structural information. We also propose a meta-path based hierarchical attentional graph neural network to capture the relations between distant nodes in the graph. We evaluate MAGNET on three public datasets, and the results show that it outperforms the best baseline method in terms of F1 score by 6.32%, 21.50%, and 25.40%, respectively. MAGNET also achieves the best performance among all baseline methods in detecting the Top-25 most dangerous Common Weakness Enumerations (CWEs), further demonstrating its effectiveness in vulnerability detection.
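To make the notion of heterogeneous relations concrete, the following is a minimal sketch, not MAGNET's implementation. It assumes that a meta-path can be written as a triple (source node type, edge type, target node type) and groups the edges of a toy code structure graph by that triple. The names `Edge` and `group_by_meta_path`, the node types, and the edge types are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class Edge(NamedTuple):
    src: int   # index of the source node
    dst: int   # index of the target node
    kind: str  # hypothetical edge type, e.g. "AST", "CFG", "DFG"


def group_by_meta_path(node_types, edges):
    """Group edges by (source node type, edge type, target node type).

    Assumption: this triple stands in for a meta-path; the paper's exact
    definition and granularity levels may differ.
    """
    groups = defaultdict(list)
    for e in edges:
        key = (node_types[e.src], e.kind, node_types[e.dst])
        groups[key].append((e.src, e.dst))
    return dict(groups)


# Toy graph: three nodes with hypothetical types and four typed edges.
node_types = {0: "Statement", 1: "Expression", 2: "Identifier"}
edges = [
    Edge(0, 1, "AST"),
    Edge(1, 2, "AST"),
    Edge(0, 2, "DFG"),
    Edge(0, 1, "CFG"),
]

meta_paths = group_by_meta_path(node_types, edges)
for key, pairs in meta_paths.items():
    print(key, pairs)
```

A homogeneous GNN would treat all four edges alike. Keeping them separated by type, as in this sketch, is what would let a model attend to each relation differently.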