Intrusion detection systems (IDSs) perform post-compromise detection of security breaches when preventive measures such as firewalls fail to avert an attack. However, these systems raise a vast number of alerts that security analysts must analyse and triage, a process that is largely manual, tedious and time-consuming. Alert correlation is a technique that reduces the number of intrusion alerts by aggregating those that are related in some way. However, this correlation is performed outside the IDS by third-party systems and tools, after the high volume of alerts has already been raised, and these third-party systems add to the complexity of security operations. In this paper, we build on the well-researched area of correlation techniques by developing a novel hierarchical event correlation model that promises to reduce the number of alerts issued by an IDS. It does so by correlating events before the IDS classifies them. The proposed model combines the best features of similarity-based and graph-based correlation techniques to deliver an ensemble capability that neither approach provides on its own. Further, we propose a process that correlates events rather than alerts, as is the case in the current art. We also develop our own correlation and clustering algorithm, tailored to network event data. The model is implemented as a proof of concept, with experiments run on the DARPA 99 intrusion detection data set. The correlation achieved an 87 percent data reduction through aggregation, producing nearly 21,000 clusters in about 30 seconds.
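To make the idea of correlating events before classification more concrete, the following is a minimal sketch that combines a similarity score with a graph step: events whose similarity meets a threshold are linked, and each connected component of the resulting graph becomes one cluster. This is not the paper's algorithm, which the abstract does not specify. The `Event` fields, the equal-weight `similarity` function, the 60-second window and the 0.8 threshold are all illustrative assumptions, and the hierarchical aspect of the proposed model is not represented.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Event:
    # Hypothetical event fields; the abstract does not list the features used.
    timestamp: float
    src_ip: str
    dst_ip: str
    dst_port: int
    protocol: str


def similarity(a: Event, b: Event, window: float = 60.0) -> float:
    """Fraction of matching attributes (assumed equal weights).

    Timestamps count as matching when they lie within `window` seconds.
    """
    matches = [
        a.src_ip == b.src_ip,
        a.dst_ip == b.dst_ip,
        a.dst_port == b.dst_port,
        a.protocol == b.protocol,
        abs(a.timestamp - b.timestamp) <= window,
    ]
    return sum(matches) / len(matches)


def correlate(events: list[Event], threshold: float = 0.8) -> list[list[Event]]:
    """Link events whose similarity meets `threshold`, then return the
    connected components of that graph as clusters (union-find)."""
    parent = list(range(len(events)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Pairwise comparison is quadratic; fine for a sketch, not for real traffic.
    for i, j in combinations(range(len(events)), 2):
        if similarity(events[i], events[j]) >= threshold:
            parent[find(i)] = find(j)

    clusters: dict[int, list[Event]] = {}
    for idx, event in enumerate(events):
        clusters.setdefault(find(idx), []).append(event)
    return list(clusters.values())


# Made-up events for illustration only (not taken from DARPA 99).
events = [
    Event(0.0, "10.0.0.5", "192.168.1.10", 80, "tcp"),
    Event(2.0, "10.0.0.5", "192.168.1.10", 80, "tcp"),
    Event(5.0, "10.0.0.5", "192.168.1.10", 443, "tcp"),
    Event(900.0, "172.16.0.9", "192.168.1.20", 22, "tcp"),
]
clusters = correlate(events)
reduction = 1 - len(clusters) / len(events)
print(f"{len(events)} events -> {len(clusters)} clusters ({reduction:.0%} reduction)")
```

In this toy input the first three events share source, destination and protocol within the time window, so they collapse into one cluster, while the unrelated fourth event stays on its own. The data reduction is computed in the same sense as the figure reported in the abstract: one minus the ratio of clusters to input events.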