Examining the Rat in the Tunnel: Interpretable Multi-Label Classification of Tor-based Malware

Tor is the most popular privacy-enhancing network, yet it is increasingly adopted by cybercriminals to obfuscate malicious traffic, hindering the identification of malware-related communications between compromised devices and Command and Control (C&C) servers. This malicious traffic can induce congestion and reduce Tor's performance, and it encourages network administrators to block Tor traffic. Recent research, however, demonstrates the potential for accurately classifying captured Tor traffic as malicious or benign. While existing efforts have addressed malware class identification, their performance remains limited, with micro-average precision and recall values around 70%. Accurately classifying specific malware classes is crucial for effective attack prevention and mitigation. Furthermore, understanding the unique patterns and attack vectors employed by different malware classes supports the development of robust and adaptable defence mechanisms. We utilise a multi-label classification technique based on Message-Passing Neural Networks and demonstrate its superiority over previous approaches such as Binary Relevance, Classifier Chains, and Label Powerset, achieving micro-average precision (MAP) and recall (MAR) exceeding 90%. Compared to previous work, we improve MAP, MAR, and Hamming Loss by 19.98%, 10.15%, and 59.21%, respectively. Next, we employ Explainable Artificial Intelligence (XAI) techniques to interpret the decision-making process within these models. Finally, we assess the robustness of all techniques by crafting adversarial perturbations capable of manipulating classifier predictions and generating false positives and negatives.
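For readers unfamiliar with the reported metrics, the following is a minimal sketch of how micro-average precision, micro-average recall, and Hamming Loss can be computed for a multi-label problem. It is not the authors' evaluation code: the function name `micro_metrics` and the toy label matrices are hypothetical, chosen only for illustration. Each row is one traffic sample and each column is one malware class (1 = label present, 0 = absent).

```python
def micro_metrics(y_true, y_pred):
    """Return (micro precision, micro recall, Hamming loss) for 0/1 label matrices.

    Micro-averaging pools true positives, false positives, and false
    negatives over every (sample, label) pair before computing the ratios.
    """
    tp = fp = fn = mismatches = total = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            tp += int(t == 1 and p == 1)   # label correctly predicted
            fp += int(t == 0 and p == 1)   # label predicted but absent
            fn += int(t == 1 and p == 0)   # label present but missed
            mismatches += int(t != p)      # any wrong (sample, label) cell
            total += 1
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    hamming = mismatches / total if total else 0.0
    return precision, recall, hamming


# Hypothetical toy data: 4 samples, 3 malware classes.
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
y_pred = [[1, 0, 0], [0, 1, 0], [1, 0, 0], [0, 1, 1]]

precision, recall, hamming = micro_metrics(y_true, y_pred)
print(f"MAP={precision:.3f} MAR={recall:.3f} Hamming={hamming:.3f}")
```

Note that Hamming Loss is an error rate over individual label assignments, so lower values are better, whereas higher MAP and MAR are better.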
