Eclectic Rule Extraction for Explainability of Deep Neural Network based Intrusion Detection Systems

This paper addresses the trust issues that arise from the ubiquity of black box algorithms and surrogate explainers in Explainable Intrusion Detection Systems (X-IDS). While Explainable Artificial Intelligence (XAI) aims to enhance transparency, black box surrogate explainers such as Local Interpretable Model-agnostic Explanations (LIME) and SHapley Additive exPlanations (SHAP) are difficult to trust: their black box nature makes the process behind explanation generation opaque and difficult to understand. To avoid this problem, one can use transparent white box algorithms such as Rule Extraction (RE). There are three types of RE algorithms: pedagogical, decompositional, and eclectic. Pedagogical methods offer fast but untrustworthy white box explanations, while decompositional methods provide trustworthy explanations but scale poorly. This work explores eclectic rule extraction, which strikes a balance between scalability and trustworthiness. By combining techniques from the pedagogical and decompositional approaches, eclectic rule extraction leverages the advantages of both while mitigating some of their drawbacks. The proposed Hybrid X-IDS architecture features eclectic RE as a white box surrogate explainer for black box Deep Neural Networks (DNNs). The presented eclectic RE algorithm extracts human-readable rules from hidden layers, producing explainable and trustworthy rulesets. Evaluations on the UNSW-NB15 and CIC-IDS-2017 datasets demonstrate the algorithm's ability to generate rulesets that mimic DNN outputs with 99.9% accuracy. The contributions of this work are the hybrid X-IDS architecture, an eclectic rule extraction algorithm applicable to intrusion detection datasets, and a thorough analysis of performance and explainability that demonstrates the trade-offs between rule extraction speed and accuracy.
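To make the eclectic idea concrete, the following is a minimal sketch and not the algorithm presented in the paper. It assumes a hypothetical toy network with hand-set weights (`W1`, `B1`, `W2`, `B2`), random two-feature data, and a simple one-condition rule learner (`best_stump`); all of these names and values are illustrative assumptions, as are the attack/benign labels. The sketch first learns a rule over hidden activations that mimics the network output (the decompositional flavor), then learns a rule over inputs that mimics that hidden condition (the pedagogical flavor), and finally scores the composed input-to-output rule by its fidelity to the network.

```python
import random

# Hypothetical toy "DNN" with hand-set weights (illustrative, not from the paper):
# 2 inputs -> 2 hidden ReLU units -> 1 output score.
W1, B1 = [[1.0, 0.0], [0.0, 1.0]], [-0.5, -0.5]
W2, B2 = [1.0, 0.2], -0.1

def hidden(x):
    """ReLU activations of the single hidden layer."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(W1, B1)]

def dnn_predict(x):
    """Binary label from the network: 1 = attack, 0 = benign (assumed labels)."""
    score = sum(w * h for w, h in zip(W2, hidden(x))) + B2
    return int(score > 0)

def best_stump(rows, targets):
    """Find the rule 'row[j] > thr' that best reproduces the 0/1 targets."""
    best = (0, 0.0, -1.0)  # (feature index, threshold, agreement)
    for j in range(len(rows[0])):
        for thr in sorted({r[j] for r in rows}):
            agree = sum(int(r[j] > thr) == t for r, t in zip(rows, targets))
            if agree / len(rows) > best[2]:
                best = (j, thr, agree / len(rows))
    return best

def extract_rule(X):
    """Compose an input -> output rule by passing through the hidden layer."""
    labels = [dnn_predict(x) for x in X]
    H = [hidden(x) for x in X]
    # Step 1: rule over hidden activations that mimics the DNN output.
    hj, hthr, _ = best_stump(H, labels)
    # Step 2: rule over inputs that mimics the hidden-unit condition.
    fires = [int(h[hj] > hthr) for h in H]
    xj, xthr, _ = best_stump(X, fires)
    # Step 3: score the composed rule by its fidelity to the DNN labels.
    fidelity = sum(int(x[xj] > xthr) == y for x, y in zip(X, labels)) / len(X)
    return xj, xthr, fidelity

random.seed(0)
X = [[random.random(), random.random()] for _ in range(200)]
j, thr, fidelity = extract_rule(X)
print(f"IF x[{j}] > {thr:.3f} THEN attack ELSE benign  (fidelity {fidelity:.3f})")
```

Fidelity here is the fraction of samples on which the extracted rule agrees with the network, which is the sense in which a ruleset "mimics" DNN outputs. A single threshold rule cannot reproduce the toy network exactly, so some disagreement is expected; richer rule learners trade extraction time for higher fidelity.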
