Deep learning-based fine-grained network intrusion detection systems (NIDS) rely on large-scale labeled data to support fast, targeted responses to different attacks. However, labeling is costly, so labeled samples are often scarce. Moreover, real fine-grained traffic follows a long-tailed distribution with severe class imbalance. These two problems often occur together and pose serious challenges to fine-grained NIDS. In this work, we propose SF-IDS, a novel semi-supervised fine-grained intrusion detection framework that classifies attacks when labels are limited and classes are highly imbalanced. We design a backbone model for self-training, RI-1DCNN, which improves feature extraction by reconstructing input samples into a multichannel image format. The uncertainty of each generated pseudo-label is estimated and combined with the prediction probability to filter pseudo-labels. To mitigate the effects of fine-grained class imbalance, we propose a hybrid loss function that combines a supervised contrastive loss with a multi-weighted classification loss, yielding more compact intra-class features and clearer inter-class separation. Experiments show that SF-IDS improves Macro-F1 by 3.01% and 2.71%, respectively, on two classical datasets with only 1% of the samples labeled.
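The pseudo-label filtering step described above can be illustrated with a minimal sketch. The abstract does not state how uncertainty is estimated, so this example assumes several stochastic forward passes per unlabeled sample (for instance, Monte Carlo dropout) and takes the standard deviation of the winning class's probability across passes as the uncertainty. The function name `filter_pseudo_labels` and both thresholds are hypothetical and are not taken from SF-IDS.

```python
from statistics import mean, pstdev


def filter_pseudo_labels(passes, min_confidence=0.9, max_uncertainty=0.05):
    """Keep pseudo-labels that are both confident and stable (illustrative only).

    passes: list of T stochastic forward passes; each pass holds, for every
            unlabeled sample, a list of class probabilities (shape T x N x C).
    Returns a list of (sample_index, pseudo_label) pairs.
    """
    num_samples = len(passes[0])
    num_classes = len(passes[0][0])
    selected = []
    for i in range(num_samples):
        # Average the class probabilities over the T passes.
        mean_probs = [mean(p[i][c] for p in passes) for c in range(num_classes)]
        label = max(range(num_classes), key=lambda c: mean_probs[c])
        confidence = mean_probs[label]
        # Assumed uncertainty measure: spread of the winning class's
        # probability across passes (population standard deviation).
        uncertainty = pstdev(p[i][label] for p in passes)
        # Accept the pseudo-label only if both criteria hold.
        if confidence >= min_confidence and uncertainty <= max_uncertainty:
            selected.append((i, label))
    return selected


# Three passes over three unlabeled samples with three classes (made-up values).
passes = [
    [[0.95, 0.03, 0.02], [0.50, 0.45, 0.05], [0.10, 0.88, 0.02]],
    [[0.97, 0.02, 0.01], [0.30, 0.65, 0.05], [0.02, 0.96, 0.02]],
    [[0.96, 0.02, 0.02], [0.60, 0.35, 0.05], [0.05, 0.93, 0.02]],
]
print(filter_pseudo_labels(passes))
print(filter_pseudo_labels(passes, max_uncertainty=0.02))
```

With the default thresholds, the second sample is rejected for low confidence. Tightening `max_uncertainty` to 0.02 also rejects the third sample, whose mean confidence is high but whose predictions vary across passes. This shows why an uncertainty criterion can filter out pseudo-labels that a probability threshold alone would accept.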