Recent work has proposed machine learning models for classifying IoT devices connected to a network. In practice, however, not all devices (and hence not all of their traffic) are available when a model is trained, so during the operational phase the model must classify new devices that were not seen in training. To address this challenge, we propose ZEST, a zero-shot learning (ZSL) framework based on self-attention for classifying both seen and unseen devices. ZEST consists of i) a self-attention based network feature extractor, termed SANE, that extracts latent space representations of IoT traffic; ii) a generative model that trains a decoder on the latent features to generate pseudo data; and iii) a supervised model that is trained on the generated pseudo data to classify devices. We carry out extensive experiments on real IoT traffic data, which demonstrate that i) ZEST achieves a significant improvement in accuracy over the baselines, and ii) SANE extracts more meaningful representations than LSTM, which has been commonly used for modeling network traffic.
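To make the idea of a self-attention based feature extractor more concrete, the following is a minimal sketch of generic scaled dot-product self-attention over a sequence of per-packet feature vectors, followed by mean pooling into a single latent vector. It is not the SANE architecture, whose details are not given here. The function names, the feature dimensions, the random projection matrices, and the mean pooling step are all assumptions made for illustration; in a real model the projections would be learned.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention_embed(packets, w_q, w_k, w_v):
    """Scaled dot-product self-attention over a packet sequence, then mean pooling.

    packets: T x d_in list of per-packet feature vectors (hypothetical features).
    Returns (latent vector of length d_k, T x T attention weights).
    """
    q, k, v = matmul(packets, w_q), matmul(packets, w_k), matmul(packets, w_v)
    d_k = len(q[0])
    k_t = [list(col) for col in zip(*k)]          # transpose K to d_k x T
    scores = matmul(q, k_t)                       # T x T similarity scores
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    attended = matmul(weights, v)                 # T x d_k attended features
    # Assumption: mean pooling over time yields one vector per traffic sequence.
    embedding = [sum(col) / len(attended) for col in zip(*attended)]
    return embedding, weights


def random_matrix(rows, cols, rng):
    """Stand-in for learned weights or real traffic features (illustrative only)."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
T, d_in, d_k = 5, 4, 3                            # hypothetical sizes
packets = random_matrix(T, d_in, rng)
w_q, w_k, w_v = (random_matrix(d_in, d_k, rng) for _ in range(3))
embedding, weights = self_attention_embed(packets, w_q, w_k, w_v)
print(len(embedding), len(weights), len(weights[0]))  # 3 5 5
```

Each row of `weights` sums to one and indicates how strongly one packet attends to every other packet in the sequence. This lets the extractor relate distant packets directly, rather than passing information step by step as a recurrent model such as LSTM does.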