Deep learning (DL) has been successfully applied to encrypted network traffic classification in experimental settings. In production use, however, a DL classifier's performance has been shown to decay inevitably over time, and re-training the model on newer datasets only partially improves it. Manually re-tuning the model architecture to meet performance expectations on newer datasets is time-consuming and requires domain expertise. We propose AutoML4ETC, a novel tool that automatically designs efficient, high-performing neural architectures for encrypted traffic classification. We define a novel, powerful search space tailored specifically to the near real-time classification of encrypted traffic using packet header bytes. We show that, with different search strategies over this search space, AutoML4ETC generates neural architectures that outperform state-of-the-art encrypted traffic classifiers on several datasets, including public benchmark datasets and real-world TLS and QUIC traffic collected from the Orange mobile network. In addition to being more accurate, AutoML4ETC's architectures are significantly more efficient and lighter in terms of the number of parameters. Finally, we make AutoML4ETC publicly available for future research.