SoK: Decoding the Enigma of Encrypted Network Traffic Classifiers

The adoption of modern encryption protocols such as TLS 1.3 has significantly challenged traditional network traffic classification (NTC) methods. As a consequence, researchers are increasingly turning to machine learning (ML) approaches to overcome these obstacles. In this paper, we comprehensively analyze ML-based NTC studies, developing a taxonomy of their design choices, benchmarking suites, and prevalent assumptions impacting classifier performance. Through this systematization, we demonstrate widespread reliance on outdated datasets, oversights in design choices, and the consequences of unsubstantiated assumptions. Our evaluation reveals that the majority of proposed encrypted traffic classifiers have mistakenly utilized unencrypted traffic due to the use of legacy datasets. Furthermore, by conducting 348 feature occlusion experiments on state-of-the-art classifiers, we show how oversights in NTC design choices lead to overfitting, and validate or refute prevailing assumptions with empirical evidence. By highlighting lessons learned, we offer strategic insights, identify emerging research directions, and recommend best practices to support the development of real-world applicable NTC methodologies.

