Network security analyzers use intrusion detection systems (IDSes) to distinguish malicious traffic from benign traffic. Deep learning-based (DL-based) IDSes have been proposed to extract high-level features automatically and to eliminate the time-consuming and costly signature extraction process. However, this new generation of IDSes still faces several challenges. One of the main issues is traffic concept drift, which manifests as new (i.e., zero-day) attacks as well as changes in the behavior of benign users and applications. Furthermore, a practical DL-based IDS must conform to a distributed architecture to handle big data challenges. We propose a framework for adapting DL-based models to changing attack and benign traffic behaviors in a more practical scenario (i.e., online adaptable IDSes). The framework combines continual deep anomaly detectors with federated learning to address these challenges. In addition, it implements sequential packet labeling, which produces an attack probability score for each flow by observing the flow's packets one at a time and updating its estimate. We evaluate the framework with different deep models (including CNN-based and LSTM-based ones) on the CIC-IDS2017 and CSE-CIC-IDS2018 datasets. Through extensive experiments, we show that the proposed distributed framework adapts well to traffic concept drift. More precisely, our results indicate that CNN-based models are well suited to continual adaptation to traffic concept drift (achieving an average detection rate above 95% while needing just 128 new flows for the updating phase), and that LSTM-based models are good candidates for sequential packet labeling in practical online IDSes (detecting intrusions after observing just their first 15 packets).
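The idea of sequential packet labeling, in which a flow's attack probability is refined as each packet arrives, can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the paper: the `Packet` representation, the `score_packet` callable (a stand-in for a trained per-packet model such as an LSTM), the `threshold` and `smoothing` parameters, and the use of an exponential moving average as the update rule.

```python
from typing import Callable, Iterable, List, Optional, Tuple

# Hypothetical packet representation; a real system would carry header
# and payload features here.
Packet = dict


def label_flow_sequentially(
    packets: Iterable[Packet],
    score_packet: Callable[[Packet], float],
    threshold: float = 0.5,
    smoothing: float = 0.5,
) -> Tuple[List[float], int]:
    """Track a running attack-probability estimate for one flow.

    Returns the estimate after each packet and the 1-based index of the
    first packet at which the estimate reaches `threshold` (0 if never).
    The exponential moving average is an illustrative update rule only.
    """
    estimates: List[float] = []
    estimate: Optional[float] = None
    first_alert = 0
    for index, packet in enumerate(packets, start=1):
        score = score_packet(packet)  # per-packet attack probability
        if estimate is None:
            estimate = score  # the first packet initializes the estimate
        else:
            # Blend the new packet's score into the running estimate.
            estimate = smoothing * score + (1.0 - smoothing) * estimate
        estimates.append(estimate)
        if first_alert == 0 and estimate >= threshold:
            first_alert = index  # earliest packet at which the flow is flagged
    return estimates, first_alert
```

Called with a flow's packets and a scoring function, the sketch returns the full trajectory of estimates, so a caller can see how many packets were needed before the flow crossed the alert threshold.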