Deep learning (DL) has been successfully applied to encrypted network traffic classification in experimental settings. In production, however, a DL classifier's performance has been shown to decay inevitably over time, and re-training the model on newer datasets has been shown to improve it only partially. Manually re-tuning the model architecture to meet performance expectations on newer datasets is time-consuming and requires domain expertise. We propose AutoML4ETC, a novel tool that automatically designs efficient and high-performing neural architectures for encrypted traffic classification. We define a novel, powerful search space tailored specifically to the early classification of encrypted traffic using packet header bytes. We show that, with different search strategies over our search space, AutoML4ETC generates neural architectures that outperform state-of-the-art encrypted traffic classifiers on several datasets, including public benchmark datasets and real-world TLS and QUIC traffic collected from the Orange mobile network. In addition to being more accurate, AutoML4ETC's architectures are significantly more efficient and lighter in terms of the number of parameters. Finally, we make AutoML4ETC publicly available for future research.
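To make the idea of running a search strategy over an architecture search space concrete, the following is a minimal sketch of random search in plain Python. It is not AutoML4ETC's implementation: the search space (`SEARCH_SPACE`), the parameter-counting rule for a stack of 1D convolutions, and the `placeholder_score` objective are all hypothetical choices made for illustration. A real run would replace the placeholder with a function that trains each candidate and returns its validation accuracy.

```python
import random

# Hypothetical search space for illustration only; it is NOT the search
# space defined by AutoML4ETC.
SEARCH_SPACE = {
    "num_layers": [1, 2, 3],
    "filters": [16, 32, 64],
    "kernel_size": [3, 5, 7],
}


def sample_architecture(rng):
    """Draw one candidate by picking a value for every dimension."""
    return {name: rng.choice(options) for name, options in SEARCH_SPACE.items()}


def count_parameters(arch, in_channels=1, num_classes=10):
    """Count weights and biases of a stack of 1D convolutions followed by
    global pooling and a dense output layer (an assumed model family)."""
    total = 0
    channels = in_channels
    for _ in range(arch["num_layers"]):
        # kernel weights plus one bias per filter
        total += arch["kernel_size"] * channels * arch["filters"] + arch["filters"]
        channels = arch["filters"]
    # dense layer applied after global pooling over the sequence axis
    total += channels * num_classes + num_classes
    return total


def random_search(evaluate, trials=20, seed=0):
    """Sample `trials` candidates and keep the one with the highest score."""
    rng = random.Random(seed)
    best_arch, best_score = None, float("-inf")
    for _ in range(trials):
        arch = sample_architecture(rng)
        score = evaluate(arch)
        if score > best_score:
            best_arch, best_score = arch, score
    return best_arch, best_score


def placeholder_score(arch):
    """Stand-in objective that simply prefers smaller models. A real
    objective would train the candidate and return validation accuracy."""
    return -count_parameters(arch)


best, score = random_search(placeholder_score)
print(best, count_parameters(best))
```

Other search strategies would differ only in how the next candidate is proposed; the search space and the evaluation function can stay unchanged, which is why the two are kept separate here.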