Recent work has proposed machine learning models for classifying IoT devices connected to a network. A practical challenge remains, however: not all devices (and hence not all of their traffic) are available when a model is trained. Consequently, during the operational phase the model must classify new devices that were not seen during training. To address this challenge, we propose ZEST, a zero-shot learning (ZSL) framework based on self-attention for classifying both seen and unseen devices. ZEST consists of i) a self-attention based network feature extractor, termed SANE, that extracts latent space representations of IoT traffic; ii) a generative model that trains a decoder on the latent features to generate pseudo data; and iii) a supervised model that is trained on the generated pseudo data to classify devices. We carry out extensive experiments on real IoT traffic data. The experiments demonstrate that i) ZEST achieves a significant improvement in accuracy over the baselines, and ii) ZEST extracts more meaningful representations than LSTM, which has been commonly used for modeling network traffic.
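To make the idea of a self-attention based feature extractor concrete, the following is a minimal sketch of scaled dot-product self-attention applied to a sequence of per-packet feature vectors, followed by mean pooling into a single latent vector. It is not the SANE architecture itself, whose details are not given here. The function name `self_attention_embed`, the random projection matrices, the feature dimensions, and the mean-pooling step are all assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention_embed(packets, w_q, w_k, w_v):
    """Hypothetical helper: map a (seq_len, n_features) traffic sequence
    to one latent vector using single-head scaled dot-product attention."""
    q = packets @ w_q                      # queries, (seq_len, d_latent)
    k = packets @ w_k                      # keys,    (seq_len, d_latent)
    v = packets @ w_v                      # values,  (seq_len, d_latent)
    scores = q @ k.T / np.sqrt(k.shape[-1])
    weights = softmax(scores, axis=-1)     # each row sums to 1
    attended = weights @ v                 # (seq_len, d_latent)
    # Assumption: mean pooling over the sequence yields the latent feature.
    return attended.mean(axis=0), weights

# Synthetic stand-in for per-packet features (e.g. size, direction, timing).
rng = np.random.default_rng(0)
seq_len, n_features, d_latent = 8, 4, 16
packets = rng.normal(size=(seq_len, n_features))
w_q, w_k, w_v = (rng.normal(size=(n_features, d_latent)) for _ in range(3))

latent, weights = self_attention_embed(packets, w_q, w_k, w_v)
```

In a trained extractor the projection matrices would be learned rather than sampled at random. The resulting latent vector is the kind of representation that a generative decoder and a downstream classifier could then consume.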