Machine learning has achieved state-of-the-art results in network intrusion detection; however, its performance degrades significantly when confronted with a new attack class, i.e. a zero-day attack. In simple terms, classical machine learning-based approaches are adept at identifying attack classes on which they have been trained, but struggle with those absent from their training data. One approach to addressing this shortcoming is to utilise anomaly detectors, which train exclusively on benign data with the goal of generalising to all attack classes, both known and zero-day. However, this comes at the expense of a prohibitively high false positive rate. This work proposes a novel contrastive loss function which maintains the advantages of other contrastive learning-based approaches (robustness to imbalanced data) while also generalising to zero-day attacks. Unlike anomaly detectors, this model learns the distributions of benign traffic using both benign and known malicious samples, i.e. samples from other well-known attack classes (not including the zero-day class), and consequently achieves significant performance improvements. The proposed approach is experimentally verified on the Lycos2017 dataset, where it achieves AUROC improvements of 0.000065 and 0.060883 over previous models in known and zero-day attack detection, respectively. Finally, the proposed method is extended to open-set recognition, achieving an OpenAUC improvement of 0.170883 over existing approaches.