Stepping-stone intrusions (SSIs) are a prevalent network evasion technique in which attackers route sessions through chains of compromised intermediate hosts to obscure their origin. Effective SSI detection requires correlating the incoming and outgoing flows at each relay host at extremely low false positive rates, a stringent requirement that renders classical statistical methods inadequate in operational settings. We apply ESPRESSO, a deep learning flow correlation model combining a transformer-based feature extraction network, time-aligned multi-channel interval features, and online triplet metric learning, to the problem of stepping-stone intrusion detection. To support training and evaluation, we develop a synthetic data collection tool that generates realistic stepping-stone traffic across five tunneling protocols: SSH, SOCAT, ICMP, DNS, and mixed multi-protocol chains. Across all five protocols and in both host-mode and network-mode detection scenarios, ESPRESSO substantially outperforms the state-of-the-art DeepCoFFEA baseline, achieving a true positive rate exceeding 0.99 at a false positive rate of $10^{-3}$ for standard bursty protocols in network mode. We further demonstrate chain length prediction as a tool for distinguishing malicious from benign pivoting, and conduct a systematic robustness analysis revealing that timing-based perturbations are the primary vulnerability of correlation-based stepping-stone detectors.