In computer networking, network traffic refers to the amount of data transmitted in the form of packets between internetworked computers or cyber-physical systems. Monitoring and analyzing network traffic is crucial for ensuring the performance, security, and reliability of a network. A significant challenge in network traffic analysis, however, is processing diverse data packets that contain both ciphertext and plaintext. Although many methods have been proposed for analyzing network traffic, they are often evaluated on different datasets. This inconsistency leads to substantial manual data processing effort and to unfair comparisons. Moreover, some data processing methods may cause data leakage because training and testing data are not properly separated. To address these issues, we introduce NetBench, a large-scale and comprehensive benchmark dataset for assessing machine learning models, especially foundation models, on both network traffic classification and generation tasks. NetBench is built upon seven publicly available datasets and covers 20 tasks: 15 classification tasks and 5 generation tasks. We further evaluate eight state-of-the-art (SOTA) classification models (including two foundation models) and two generative models using our benchmark. The results show that foundation models significantly outperform traditional deep learning methods in traffic classification. We believe NetBench will facilitate fair comparisons among various approaches and advance the development of foundation models for network traffic. Our benchmark is available at https://github.com/WM-JayLab/NetBench.
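The data leakage concern mentioned above can be illustrated with a minimal sketch. It assumes that leakage arises when packets from the same flow land in both the training and the testing set, and it avoids this by assigning whole flows to one side of the split. The `Packet` record, its `flow_id` field, and the `split_by_flow` helper are hypothetical names introduced for illustration only; this is not NetBench's actual preprocessing procedure.

```python
import hashlib
from collections import namedtuple

# Hypothetical packet record; the field names are assumptions for this sketch.
Packet = namedtuple("Packet", ["flow_id", "payload", "label"])


def split_by_flow(packets, test_fraction=0.2):
    """Split packets so that every packet of a flow lands on the same side."""
    train, test = [], []
    for pkt in packets:
        # Hash the flow ID to a stable value in [0, 1). The same flow ID
        # always maps to the same value, so a flow is never divided
        # between the training and the testing set.
        digest = hashlib.sha256(pkt.flow_id.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:8], "big") / 2**64
        if bucket < test_fraction:
            test.append(pkt)
        else:
            train.append(pkt)
    return train, test


# Example usage with synthetic packets spread over 50 flows.
packets = [Packet(f"flow-{i % 50}", b"\x00", i % 3) for i in range(500)]
train, test = split_by_flow(packets)
```

Because the assignment depends only on the flow identifier, the split is reproducible across runs. The realized test share is only approximately `test_fraction`, since it depends on how the flow IDs hash and on the size of each flow.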