The problem of revealing botnet activity through Domain Generation Algorithm (DGA) detection seems to be solved, considering that available deep learning classifiers achieve accuracies of over 99.9%. However, these classifiers provide a false sense of security, as they are heavily biased and allow for trivial detection bypass. In this work, we leverage explainable artificial intelligence (XAI) methods to analyze the reasoning of deep learning classifiers and to systematically reveal such biases. We show that eliminating these biases from DGA classifiers considerably deteriorates their performance. Nevertheless, we are able to design a context-aware detection system that is free of the identified biases and maintains the detection rate of state-of-the-art deep learning classifiers. In this context, we propose a visual analysis system that helps to better understand a classifier's reasoning, thereby increasing trust in and transparency of detection methods and facilitating decision-making.