Applying Self-supervised Learning to Network Intrusion Detection for Network Flows with Graph Neural Network

Graph Neural Networks (GNNs) have attracted considerable attention for Network Intrusion Detection Systems (NIDS) because they are well suited to representing network traffic flows. However, most existing GNN-based methods for NIDS are supervised or semi-supervised. Network flows must be manually annotated to provide supervisory labels, a process that is time-consuming or even infeasible, which makes it difficult for NIDS to adapt to potentially complex attacks, especially in large-scale real-world scenarios. Existing GNN-based self-supervised methods focus on binary classification of network flows as benign or not, and thus fail to reveal the attack type in practice. This paper studies the application of GNNs to identifying the specific types of network flows in an unsupervised manner. We first design an encoder to obtain graph embeddings; it introduces a graph attention mechanism and treats edge information as the only essential factor. We then propose a self-supervised method based on graph contrastive learning. The method samples center nodes and, for each center node, generates a subgraph from the node and its direct neighbors together with a corresponding contrastive subgraph from the interpolated graph, and finally constructs positive and negative samples from these subgraphs. Furthermore, we introduce a structured contrastive loss function based on edge features and local graph topology. To the best of our knowledge, this is the first GNN-based self-supervised method for multiclass classification of network flows in NIDS. Detailed experiments on four real-world datasets (NF-Bot-IoT, NF-Bot-IoT-v2, NF-CSE-CIC-IDS2018, and NF-CSE-CIC-IDS2018-v2) systematically compare our model with state-of-the-art supervised and self-supervised models and illustrate the considerable potential of our method. Our code is available at https://github.com/renj-xu/NEGSC.
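
The subgraph sampling step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation (see the linked repository for that). The toy flows, the function names, and in particular the interpolation rule are assumptions made for illustration: the abstract does not say how the interpolated graph is built, so the sketch simply blends each edge feature vector with that of a randomly chosen edge. How the resulting subgraphs are labeled as positive or negative samples is also left open here.

```python
import random

# Hypothetical toy flows: (source endpoint, destination endpoint, edge features).
FLOWS = [
    ("10.0.0.1", "10.0.0.2", [0.2, 1.0]),
    ("10.0.0.1", "10.0.0.3", [0.9, 0.1]),
    ("10.0.0.2", "10.0.0.3", [0.4, 0.4]),
    ("10.0.0.4", "10.0.0.5", [0.7, 0.3]),
]


def ego_subgraph(flows, center):
    """Keep the flows that touch `center`: the center node and its direct neighbors."""
    return [(s, d, f) for (s, d, f) in flows if center in (s, d)]


def interpolate_graph(flows, alpha, rng):
    """ASSUMED interpolation: blend each edge's features with a randomly picked edge's."""
    mixed = []
    for s, d, f in flows:
        _, _, g = rng.choice(flows)
        mixed.append((s, d, [alpha * a + (1 - alpha) * b for a, b in zip(f, g)]))
    return mixed


def sample_subgraph_pairs(flows, num_centers, alpha=0.5, seed=0):
    """Sample center nodes and pair each ego subgraph with its contrastive counterpart."""
    rng = random.Random(seed)
    nodes = sorted({n for s, d, _ in flows for n in (s, d)})
    contrast_graph = interpolate_graph(flows, alpha, rng)
    pairs = []
    for center in rng.sample(nodes, num_centers):
        pairs.append({
            "center": center,
            "subgraph": ego_subgraph(flows, center),
            "contrastive_subgraph": ego_subgraph(contrast_graph, center),
        })
    return pairs


if __name__ == "__main__":
    for pair in sample_subgraph_pairs(FLOWS, num_centers=2):
        print(pair["center"], len(pair["subgraph"]), "flows")
```

In this sketch the contrastive subgraph keeps the topology of the original subgraph and differs only in its edge features, which matches the abstract's emphasis on edge information but is otherwise a design choice of the example.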

