In recent years, there has been growing interest in using Machine Learning (ML), especially Deep Learning (DL), to solve Network Intrusion Detection (NID) problems. However, feature distribution shift remains a difficulty, because changes in the features' distributions over time degrade model performance. As one promising solution, model pretraining has emerged as a novel training paradigm that brings robustness against feature distribution shift and has proven successful in Computer Vision (CV) and Natural Language Processing (NLP). To verify whether this paradigm benefits NID problems, we propose SwapCon, an ML model for NID that compresses shift-invariant feature information during the pretraining stage and refines it during the finetuning stage. We demonstrate the presence of feature distribution shift using the Kyoto2006+ dataset. We show that pretraining a model of the proper size can increase robustness against feature distribution shift by over 8%. Moreover, we show that an adequate numerical embedding strategy also enhances the performance of pretrained models. Further experiments show that the proposed SwapCon model also outperforms eXtreme Gradient Boosting (XGBoost) and K-Nearest Neighbor (KNN) based models by a large margin.
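The abstract does not state how feature distribution shift is measured. As a minimal sketch of one common way to quantify it, the snippet below computes a two-sample Kolmogorov-Smirnov statistic between the values of a single numerical feature in two time windows. The function name `ks_statistic` and the synthetic Gaussian samples are assumptions made for illustration; they are not taken from the paper or from Kyoto2006+.

```python
import numpy as np


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between the empirical CDFs."""
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    # Evaluate both empirical CDFs at every observed value.
    grid = np.concatenate([a, b])
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    return float(np.max(np.abs(cdf_a - cdf_b)))


rng = np.random.default_rng(0)
# Synthetic stand-ins for one numerical feature (hypothetical, not real traffic data).
early = rng.normal(0.0, 1.0, 5000)        # reference time window
same_period = rng.normal(0.0, 1.0, 5000)  # drawn from the same distribution
later = rng.normal(1.0, 1.5, 5000)        # later window with shifted mean and scale

print("no shift:", ks_statistic(early, same_period))
print("shifted: ", ks_statistic(early, later))
```

A statistic near 0 suggests the two windows share a distribution, while values approaching 1 indicate strong shift. Applied per feature and per time window, such a score gives a rough picture of how far later data has drifted from the training period.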