siForest: Detecting Network Anomalies with Set-Structured Isolation Forest

As cyber threats continue to evolve in sophistication and scale, the ability to detect anomalous network behavior has become critical for maintaining robust cybersecurity defenses. Modern cybersecurity systems face the overwhelming challenge of analyzing billions of daily network interactions to identify potential threats, making efficient and accurate anomaly detection algorithms crucial for network defense. This paper investigates the use of variations of the Isolation Forest (iForest) machine learning algorithm for detecting anomalies in internet scan data. In particular, it presents the Set-Partitioned Isolation Forest (siForest), a novel extension of the iForest method designed to detect anomalies in set-structured data. By treating instances such as sets of multiple network scans with the same IP address as cohesive units, siForest effectively addresses some challenges of analyzing complex, multidimensional datasets. Extensive experiments on synthetic datasets simulating diverse anomaly scenarios in network traffic demonstrate that siForest has the potential to outperform traditional approaches on some types of internet scan data.
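The abstract does not spell out how siForest partitions sets, so the following is only a minimal sketch of the underlying idea, not the paper's algorithm. It builds a small standard isolation forest over individual scans and then averages the scores of all scans that share an IP address, so that each IP is judged as one unit. The feature tuples, the IP addresses, the mean aggregation, and every function name here are assumptions made for illustration.

```python
import math
import random
from collections import defaultdict

def c_factor(n):
    """Expected path length of an unsuccessful BST search over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n

def build_tree(points, depth, max_depth, rng):
    """Recursively isolate points with random axis-aligned splits."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return {"dim": dim, "split": split,
            "left": build_tree(left, depth + 1, max_depth, rng),
            "right": build_tree(right, depth + 1, max_depth, rng)}

def path_length(point, node, depth=0):
    """Depth at which the point lands, adjusted for unsplit leaf size."""
    if "size" in node:
        return depth + c_factor(node["size"])
    child = node["left"] if point[node["dim"]] < node["split"] else node["right"]
    return path_length(point, child, depth + 1)

def anomaly_scores(points, n_trees=100, sample_size=64, seed=0):
    """Standard iForest score per point: closer to 1 means more anomalous."""
    if len(points) < 2:
        raise ValueError("need at least two points")
    rng = random.Random(seed)
    psi = min(sample_size, len(points))
    max_depth = math.ceil(math.log2(psi))
    trees = [build_tree(rng.sample(points, psi), 0, max_depth, rng)
             for _ in range(n_trees)]
    norm = c_factor(psi)
    return [2.0 ** (-(sum(path_length(p, t) for t in trees) / n_trees) / norm)
            for p in points]

def score_by_ip(scans):
    """Assumed aggregation: mean per-scan score over each IP's set of scans."""
    scores = anomaly_scores([features for _, features in scans])
    grouped = defaultdict(list)
    for (ip, _), score in zip(scans, scores):
        grouped[ip].append(score)
    return {ip: sum(v) / len(v) for ip, v in grouped.items()}

def make_demo_scans(seed=1):
    """Hypothetical data: 30 ordinary IPs and one IP whose scans sit far away."""
    rng = random.Random(seed)
    scans = [(f"10.0.0.{i}", (rng.gauss(0, 1), rng.gauss(0, 1)))
             for i in range(30) for _ in range(3)]
    scans += [("203.0.113.99", (10 + rng.gauss(0, 0.1), 10 + rng.gauss(0, 0.1)))
              for _ in range(3)]
    return scans
```

Calling `score_by_ip(make_demo_scans())` returns one score per IP address. Averaging is only one possible way to treat a set of scans as a unit; a set-aware method could instead keep each set together while the trees are being built.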
