Intrusion detection systems (IDSs) reinforce cyber defense by autonomously monitoring various data sources for traces of attacks. However, IDSs are also infamous for frequently raising false positives and alerts that are difficult to interpret without context. This places a high workload on security operators, who must manually verify all reported alerts, and often leads to fatigue and incorrect decisions. To generate more meaningful alerts and alleviate these issues, research on multi-step attack analysis proposes approaches for filtering, clustering, and correlating IDS alerts, as well as for generating attack graphs. Unfortunately, existing data sets are outdated, unreliable, narrowly focused, or only suitable for IDS evaluation. Since hardly any suitable benchmark data sets are publicly available, researchers often resort to private data sets, which prevents reproducibility of evaluations. We therefore generate a new alert data set that we publish alongside this paper. The data set contains alerts from three distinct IDSs monitoring eight executions of a multi-step attack as well as simulations of normal user behavior. To illustrate the potential of our data set, we experiment with alert prioritization as well as two open-source tools for meta-alert generation and attack graph extraction.