An Interpretable Generalization Mechanism for Accurately Detecting Anomaly and Identifying Networking Intrusion Techniques

Recent Intrusion Detection Systems (IDS) that integrate Explainable AI (XAI) methodologies have notably improved system performance through precise feature selection. However, a thorough understanding of cyber-attacks requires decision-making processes within the IDS that are inherently explainable. In this paper, we present the Interpretable Generalization Mechanism (IG), which aims to substantially advance IDS capabilities. IG discerns coherent patterns, which makes it interpretable when distinguishing between normal and anomalous network traffic. Further, the synthesis of coherent patterns sheds light on intricate intrusion pathways, providing essential insights for cybersecurity forensics. In experiments on the real-world datasets NSL-KDD, UNSW-NB15, and UKM-IDS20, IG remains accurate even at low training-to-test ratios. At a 10%-to-90% ratio, IG achieves Precision (PRE)=0.93, Recall (REC)=0.94, and Area Under the Curve (AUC)=0.94 in NSL-KDD; PRE=0.98, REC=0.99, and AUC=0.99 in UNSW-NB15; and PRE=0.98, REC=0.98, and AUC=0.99 in UKM-IDS20. Notably, in UNSW-NB15, IG achieves REC=1.0 and PRE of at least 0.98 from the 40%-to-60% ratio onward; in UKM-IDS20, IG achieves REC=1.0 and PRE of at least 0.88 from the 20%-to-80% ratio onward. Importantly, in UKM-IDS20, IG successfully identifies all three anomalous instances to which it had no prior exposure, demonstrating its generalization capability. These results and inferences are reproducible. In sum, IG generalizes well: it performs consistently across diverse datasets and training-to-test ratios (from 10%-to-90% to 90%-to-10%) and identifies novel anomalies without prior exposure. Its interpretability rests on coherent evidence that accurately distinguishes normal from anomalous activity, significantly improving detection accuracy and reducing false alarms, thereby strengthening IDS reliability and trustworthiness.
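For readers less familiar with the reported metrics, the following is a minimal sketch of how Precision, Recall, and AUC can be computed for a binary normal/anomalous classifier, together with a training-to-test split at a given ratio. It is not the IG mechanism and not the authors' evaluation code. The function names (`precision_recall`, `auc`, `split_by_ratio`), the label convention (1 = anomalous, 0 = normal), and the rank-based AUC formulation are all assumptions made for illustration; the paper may compute AUC differently.

```python
import random
from typing import List, Sequence, Tuple


def precision_recall(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (precision, recall), treating label 1 (anomalous) as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise ranking (ties count as half a win)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def split_by_ratio(records: Sequence, train_fraction: float, seed: int = 0) -> Tuple[List, List]:
    """Shuffle and split records, e.g. train_fraction=0.1 for a 10%-to-90% ratio."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]
```

Under these assumptions, a 10%-to-90% experiment would fit a detector on the first list returned by `split_by_ratio(records, 0.1)` and report `precision_recall` and `auc` on the second.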
