Network anomaly detection is an active research area, largely because of its many applications in network security. The rise of new models based on variational autoencoders and generative adversarial networks has motivated a reevaluation of traditional anomaly detection techniques. It is essential, however, to understand these new models in light of the experience gained from years of evaluating network security data for anomaly detection. In this paper, we revisit PCA-based anomaly detection techniques from the point of view of probabilistic generative models, and contribute a mathematical model that relates the two. Specifically, we start from the probabilistic PCA model and explain its connection to the Multivariate Statistical Network Monitoring (MSNM) framework, which was recently proposed as a means of bringing anomaly detection experience from industrial processes into the field of networking. We evaluate the mathematical model on two datasets: the first is a synthetic dataset created to better understand the proposed analysis, and the second, UGR'16, is a real-traffic dataset specifically designed for anomaly detection in network security. From this evaluation we draw conclusions that we consider useful when applying generative models to network security detection.