A Critical Assessment of Interpretable and Explainable Machine Learning for Intrusion Detection

A large number of studies address interpretable and explainable ML for cybersecurity, in particular for intrusion detection. Many of these studies contain a significant amount of overlapping and repeated evaluation and analysis. At the same time, they overlook, and often completely disregard, crucial issues related to the model, the data, the learning process, and the utility of explanations. These issues include the use of overly complex and opaque ML models, unaccounted-for data imbalances and correlated features, inconsistent influential features across different explanation methods, inconsistencies stemming from the constituents of a learning process, and the implausible utility of explanations. In this work, we empirically demonstrate these issues, analyze them, and propose practical solutions in the context of feature-based model explanations. Specifically, we advise avoiding complex opaque models such as Deep Neural Networks and instead using interpretable ML models such as Decision Trees, because the available intrusion datasets are not difficult for such interpretable models to classify successfully. We then draw attention to binary classification metrics such as the Matthews Correlation Coefficient, which are well suited to imbalanced datasets. Moreover, we find that feature-based model explanations are most often inconsistent across different settings. To further gauge the extent of these inconsistencies, we introduce the notion of cross explanations, which corroborates that the features deemed impactful by one explanation method most often differ from those identified by another. Furthermore, we show that strongly correlated data features and the constituents of a learning process, such as hyper-parameters and the optimization routine, are yet another source of inconsistent explanations. Finally, we discuss the utility of feature-based explanations.
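To illustrate why a metric such as the Matthews Correlation Coefficient matters on imbalanced data, the following minimal sketch compares it with plain accuracy. The toy labels (95 benign flows, 5 attacks) and the two hypothetical detectors are assumptions made purely for illustration; they are not taken from the datasets or experiments of this work. The sketch uses the standard definition MCC = (TP·TN − FP·FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)), returning 0 when the denominator is zero.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels, with 1 = attack and 0 = benign."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(y_true, y_pred):
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(y_true, y_pred):
    """Matthews Correlation Coefficient; 0.0 by convention if the denominator is zero."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Hypothetical toy data: 95 benign samples followed by 5 attack samples.
y_true = [0] * 95 + [1] * 5

# Detector A (hypothetical): labels every sample as benign.
pred_all_benign = [0] * 100

# Detector B (hypothetical): 2 false alarms, catches 4 of the 5 attacks.
pred_detector = [0] * 93 + [1] * 2 + [1] * 4 + [0] * 1

acc_a, mcc_a = accuracy(y_true, pred_all_benign), mcc(y_true, pred_all_benign)
acc_b, mcc_b = accuracy(y_true, pred_detector), mcc(y_true, pred_detector)

print(f"all-benign: accuracy={acc_a:.2f}, MCC={mcc_a:.2f}")
print(f"detector:   accuracy={acc_b:.2f}, MCC={mcc_b:.2f}")
```

In this toy setting the trivial all-benign detector still reaches 95% accuracy while its MCC is 0, whereas the detector that actually finds attacks is only slightly ahead on accuracy but clearly ahead on MCC.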
