Machine learning (ML) is an efficient and popular approach to network traffic classification. However, network traffic classification is a challenging domain, and trained models may degrade soon after deployment because datasets become obsolete and computer networks evolve quickly as new or updated protocols appear. Moreover, a significant change in the behavior of a traffic type (and, therefore, in the underlying features representing that traffic) can cause a large and sudden performance drop of the deployed model, known as data or concept drift. In most cases, complete retraining is performed, often without further investigation of the root causes, because good dataset quality is assumed. However, this is not always the case, and further investigation must be performed. This paper proposes a novel methodology for evaluating dataset stability and a benchmark workflow for comparing datasets. The proposed framework is based on a concept drift detection method that also uses ML feature weights to boost detection performance. The benefits of this work are demonstrated on the CESNET-TLS-Year22 dataset. We provide an initial dataset stability benchmark that describes the dataset's stability and weak points and identifies the next steps for optimization. Lastly, using the proposed benchmarking methodology, we show the impact of optimization on the created dataset variants.
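To illustrate the general idea of feature-weighted drift detection, the following is a minimal sketch, not the paper's actual method: it assumes a per-feature Kolmogorov-Smirnov statistic as the drift measure and RandomForest feature importances as the ML feature weights; the paper's detection statistic, weighting scheme, and thresholding may differ.

```python
import numpy as np
from scipy.stats import ks_2samp
from sklearn.ensemble import RandomForestClassifier


def feature_weighted_drift_score(X_ref, X_cur, feature_weights):
    """Aggregate per-feature drift into one score, weighting each feature
    by its importance to the classifier (illustrative aggregation only)."""
    scores = np.array([
        ks_2samp(X_ref[:, i], X_cur[:, i]).statistic  # per-feature KS distance
        for i in range(X_ref.shape[1])
    ])
    w = feature_weights / feature_weights.sum()
    return float(np.dot(w, scores))


# Example usage on synthetic data (illustration only, not the CESNET-TLS-Year22 data).
rng = np.random.default_rng(0)
X_ref = rng.normal(size=(1000, 5))
y_ref = (X_ref[:, 0] + 0.5 * X_ref[:, 1] > 0).astype(int)
# Simulate drift in feature 0 only.
X_cur = X_ref + rng.normal(scale=[1.0, 0.0, 0.0, 0.0, 0.0], size=(1000, 5))

clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_ref, y_ref)
score = feature_weighted_drift_score(X_ref, X_cur, clf.feature_importances_)
print(f"weighted drift score: {score:.3f}")  # flag drift if above a chosen threshold
```

Weighting the per-feature drift scores by feature importance emphasizes drift in the features the model actually relies on, which is the intuition behind using ML feature weights to boost detection performance.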