In this paper, we address the critical need for interpretable and uncertainty-aware machine learning models in the context of online learning for high-risk industries, particularly cyber-security. While deep learning and other complex models have demonstrated impressive predictive capabilities, their opacity and lack of uncertainty quantification raise significant concerns about their trustworthiness. We propose a novel pipeline for online supervised learning problems in cyber-security that harnesses the inherent interpretability and uncertainty awareness of Additive Gaussian Processes (AGPs). Our approach aims to balance predictive performance with transparency while improving the limited scalability of AGPs, their main drawback, potentially enabling security analysts to better validate threat detections, investigate and reduce false positives, and make trustworthy, informed decisions. This work contributes to the growing field of interpretable AI by proposing a class of models that can be significantly beneficial for high-stakes decision problems such as those typical of the cyber-security domain. The source code is available.
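To make the model class concrete, the sketch below illustrates the basic additive-GP construction the abstract refers to: the covariance is a sum of univariate kernels, one per input feature, so each feature's contribution and the predictive variance can be inspected separately. This is a minimal illustration in Python/NumPy with synthetic data and fixed hyperparameters, not the pipeline proposed in the paper; all names are hypothetical, and the online-learning and scalability components are omitted.

```python
# Minimal additive-GP regression sketch (not the paper's implementation).
# The kernel is a sum of per-feature RBF kernels; hyperparameters are fixed.
import numpy as np

def rbf_1d(a, b, lengthscale=1.0, variance=1.0):
    """Univariate squared-exponential kernel between 1-D arrays a and b."""
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

def additive_kernel(Xa, Xb):
    """Additive kernel: k(x, x') = sum_d k_d(x_d, x'_d), one term per feature."""
    return sum(rbf_1d(Xa[:, d], Xb[:, d]) for d in range(Xa.shape[1]))

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))                       # 200 samples, 3 features
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2 + 0.1 * rng.normal(size=200)

noise = 0.01
K = additive_kernel(X, X) + noise * np.eye(len(X))  # training covariance
L = np.linalg.cholesky(K)
alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))  # K^{-1} y

Xs = rng.normal(size=(5, 3))                         # test points
Ks = additive_kernel(Xs, X)
mean = Ks @ alpha                                    # posterior predictive mean
v = np.linalg.solve(L, Ks.T)
var = np.diag(additive_kernel(Xs, Xs)) - np.sum(v ** 2, axis=0)  # predictive variance
print(mean, var)
```

In the full pipeline described by the abstract, the kernel hyperparameters would be learned from data and the exact O(n^3) inference above would be replaced by a scalable approximation suited to online updates, which is precisely the scalability limitation the paper aims to address.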