The problem of revealing botnet activity through Domain Generation Algorithm (DGA) detection seems to be solved, considering that available deep learning classifiers achieve accuracies of over 99.9%. However, these classifiers provide a false sense of security, as they are heavily biased and allow for trivial detection bypass. In this work, we leverage explainable artificial intelligence (XAI) methods to analyze the reasoning of deep learning classifiers and to systematically reveal such biases. We show that eliminating these biases from DGA classifiers considerably deteriorates their performance. Nevertheless, we are able to design a context-aware detection system that is free of the identified biases and maintains the detection rate of state-of-the-art deep learning classifiers. In this context, we propose a visual analysis system that helps to better understand a classifier's reasoning, thereby increasing trust in and transparency of detection methods and facilitating decision-making.