Machine learning and neural networks have become increasingly popular for detecting encrypted malware traffic. They mine and learn complex traffic patterns and detect malware by fitting a boundary between malicious and benign traffic. Compared with signature-based methods, they offer better scalability and flexibility. However, because malware is frequently updated and spawns new variants, current methods suffer from high false positive rates and perform poorly on unknown malware traffic. Effective malware traffic detection therefore remains a critical task. In this paper, we introduce CBSeq to address these problems. CBSeq constructs a stable traffic representation, the behavior sequence, to characterize attacking intent and detect malware traffic. We propose using channels with similar behavior as the detection object and extract side-channel content to construct behavior sequences. Unlike those of benign activities, the behavior sequences of traffic from malware and its variants exhibit strong internal correlations. Moreover, we design MSFormer, a powerful Transformer-based multi-sequence fusion classifier. It captures the internal similarity of behavior sequences, thereby distinguishing malware traffic from benign traffic. Our evaluations demonstrate that CBSeq detects a variety of known malware traffic effectively and outperforms state-of-the-art methods in detecting unknown malware traffic.