Cybersecurity threat detection based on a UEBA framework using Deep Autoencoders

User and Entity Behaviour Analytics (UEBA) is a broad branch of data analytics that attempts to build a normal behavioural profile in order to detect anomalous events. Among the techniques used to detect anomalies, Deep Autoencoders are one of the most promising deep learning models for UEBA tasks, allowing explainable detection of security incidents that could lead to the leak of personal data, hijacking of systems, or access to sensitive business information. In this study, we introduce the first implementation of an explainable UEBA-based anomaly detection framework that leverages Deep Autoencoders in combination with Doc2Vec to process both numerical and textual features. Additionally, based on the theoretical foundations of neural networks, we offer a novel proof demonstrating the equivalence of two widely used definitions for fully-connected neural networks. The experimental results demonstrate the proposed framework's capability to effectively detect real and synthetic anomalies generated from real attack data, showing that the models not only identify anomalies correctly but also provide explainable results that enable the reconstruction of the possible origin of the anomaly. Our findings suggest that the proposed UEBA framework can be seamlessly integrated into enterprise environments, complementing existing security systems for explainable threat detection.


Related content

Autoencoders


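An autoencoder is an artificial neural network used to learn efficient data encodings in an unsupervised manner. Its aim is to learn a representation (encoding) of a set of data, typically for dimensionality reduction, by training the network to ignore signal "noise". Alongside this reduction side, a reconstruction side is learned, in which the autoencoder tries to generate, from the reduced encoding, a representation as close as possible to its original input; this is where the name comes from. Several variants of the basic model exist, each aiming to force the learned representation of the input to take on useful properties. Autoencoders are applied effectively to many problems, from facial recognition to capturing the semantic meaning of words.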
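The encode-then-reconstruct idea can be illustrated with a minimal sketch. It assumes a linear autoencoder with tied weights that compresses 2-D points to a single number, trained by plain gradient descent on the squared reconstruction error. This is a deliberately tiny stand-in, not a deep autoencoder and not the framework described in the abstract; the data, the function names (`encode`, `decode`, `train`, `reconstruction_error`), and the hyperparameters are all hypothetical choices made for illustration.

```python
def encode(w, x):
    """Compress a 2-D point to a single number (the code)."""
    return w[0] * x[0] + w[1] * x[1]


def decode(w, z):
    """Map the 1-D code back to 2-D, reusing the same (tied) weights."""
    return (z * w[0], z * w[1])


def reconstruction_error(w, x):
    """Squared distance between a point and its reconstruction."""
    x_hat = decode(w, encode(w, x))
    return (x[0] - x_hat[0]) ** 2 + (x[1] - x_hat[1]) ** 2


def train(data, lr=0.05, epochs=300):
    """Full-batch gradient descent on the mean reconstruction error."""
    w = [0.5, 0.5]  # arbitrary starting weights (assumption)
    n = len(data)
    for _ in range(epochs):
        grad = [0.0, 0.0]
        for x in data:
            z = encode(w, x)
            x_hat = decode(w, z)
            r = (x[0] - x_hat[0], x[1] - x_hat[1])  # residual
            rw = r[0] * w[0] + r[1] * w[1]
            for k in range(2):
                # d/dw_k of ||x - (w.x) w||^2, averaged over the batch
                grad[k] += -2.0 * (z * r[k] + rw * x[k]) / n
        w = [w[k] - lr * grad[k] for k in range(2)]
    return w


# Toy "normal" data: the second feature is always twice the first.
normal = [(t / 10.0, 2.0 * t / 10.0) for t in range(-10, 11)]
w = train(normal)

typical = (0.5, 1.0)   # follows the pattern seen in training
unusual = (0.5, -1.0)  # breaks the pattern

print("weights:", [round(v, 3) for v in w])
print("typical error:", reconstruction_error(w, typical))
print("unusual error:", reconstruction_error(w, unusual))
```

A point that follows the training pattern is reconstructed almost exactly, while one that breaks the pattern is not, which is why reconstruction error is commonly used as an anomaly score. A deep autoencoder replaces the single tied weight vector with stacked nonlinear layers but keeps the same encode, decode, and compare structure.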
