In response to Distributed Denial of Service (DDoS) attacks, recent research increasingly relies on Machine Learning (ML)-based solutions, whose effectiveness depends largely on the quality of labeled training datasets. To address the scarcity of such datasets, data augmentation with synthetic traces is often employed. However, current synthetic trace generation methods struggle to capture the complex temporal patterns and spatial distributions exhibited by emerging DDoS attacks. As a result, the generated traces bear insufficient resemblance to real ones and yield unsatisfactory detection accuracy when used in ML tasks. In this paper, we propose Dual-Stream Temporal-Field Diffusion (DSTF-Diffusion), a multi-view, multi-stream generative model for network traffic based on diffusion models. It features two main streams. The field stream uses spatial mapping to bridge network data characteristics with the pre-trained domain of stable diffusion models, translating complex network interactions into a format that stable diffusion can process. The temporal stream adopts a dynamic temporal modeling approach to capture the intrinsic temporal patterns of network traffic. Extensive experiments demonstrate that data generated by our model exhibits higher statistical similarity to the original traces than current state-of-the-art solutions and enhances performance on a wide range of downstream tasks.