Intrusion detection systems (IDSs) perform post-compromise detection of security breaches whenever preventive measures such as firewalls fail to avert an attack. However, these systems raise a vast number of alerts that must be analysed and triaged by security analysts, a process that is largely manual, tedious and time-consuming. Alert correlation is a technique that reduces the number of intrusion alerts by aggregating those that are related in some way. However, the correlation is typically performed outside the IDS by third-party systems and tools, after the high volume of alerts has already been raised, and these third-party systems add to the complexity of security operations. In this paper, we build on the well-researched area of correlation techniques by developing a novel hierarchical event correlation model that promises to reduce the number of alerts issued by an IDS. This is achieved by correlating the events before the IDS classifies them. The proposed model combines the best features of similarity-based and graph-based correlation techniques to deliver an ensemble capability that neither approach provides on its own. Further, we propose a process for correlating events rather than alerts, as is the case in the current state of the art. We also develop our own correlation and clustering algorithm, tailored to network event data. The model is implemented as a proof of concept, with experiments run on the DARPA 99 intrusion detection data set. The correlation achieved 87 percent data reduction through aggregation, producing nearly 21,000 clusters in about 30 seconds.
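To make the idea of combining similarity-based and graph-based correlation concrete, the following is a minimal sketch and not the algorithm developed in the paper. It assumes a hypothetical `Event` record with four attributes, an attribute-matching similarity score, and a threshold of 0.75. Events whose similarity meets the threshold are linked by an edge, and the connected components of the resulting graph are taken as clusters.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Event:
    """Hypothetical network event record; the field names are assumptions."""
    src_ip: str
    dst_ip: str
    dst_port: int
    protocol: str


FIELDS = ("src_ip", "dst_ip", "dst_port", "protocol")


def similarity(a, b):
    """Fraction of attributes on which two events agree (0.0 to 1.0)."""
    matches = sum(getattr(a, f) == getattr(b, f) for f in FIELDS)
    return matches / len(FIELDS)


def correlate(events, threshold=0.75):
    """Link events whose similarity meets the threshold, then return the
    connected components of that graph as sorted lists of event indices."""
    parent = list(range(len(events)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(events)), 2):
        if similarity(events[i], events[j]) >= threshold:
            parent[find(i)] = find(j)  # merge the two components

    clusters = {}
    for i in range(len(events)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Illustrative events only; these are not taken from the DARPA 99 data.
events = [
    Event("10.0.0.5", "192.168.1.10", 80, "tcp"),
    Event("10.0.0.5", "192.168.1.10", 443, "tcp"),
    Event("10.0.0.5", "192.168.1.11", 443, "tcp"),
    Event("172.16.0.9", "192.168.1.20", 53, "udp"),
]
clusters = correlate(events)
print(clusters)  # [[0, 1, 2], [3]]
```

Here the first and third events share only two of four attributes, yet they end up in the same cluster because each is sufficiently similar to the second event. That transitive grouping is what the graph step contributes beyond pairwise similarity alone. The all-pairs comparison is quadratic in the number of events, so a sketch like this would need blocking or indexing before being applied to a full traffic capture.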