Identifying IoT devices is crucial for network monitoring, security enforcement, and inventory tracking. However, most existing identification methods rely on deep packet inspection, which raises privacy concerns and adds computational complexity. More importantly, existing work overlooks the impact of wireless channel dynamics on the accuracy of layer-2 features, which limits its effectiveness in real-world scenarios. In this work, we define the latency of specific probe-response packet exchanges, referred to as "device latency," and use it as the main feature for device identification. We also show that wireless channel dynamics critically affect the accuracy of device identification based on device latency. Specifically, this work introduces the "accumulation score," a novel approach to capturing fine-grained channel dynamics and their impact on device latency when training machine learning models. We implement the proposed methods and measure the accuracy and overhead of device identification in real-world scenarios. The results confirm that by incorporating the accumulation score for balanced data collection and for training machine learning algorithms, we achieve an F1 score of over 97% for device identification, even amidst wireless channel dynamics. This is a significant improvement over the 75% F1 score obtained when the impact of channel dynamics on data collection and device latency is disregarded.
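Since the reported results are stated as F1 scores, the following minimal sketch shows how a multi-class F1 score can be computed from true and predicted device labels. It is an illustration only: the abstract does not say whether the reported figure is macro- or micro-averaged, so macro averaging is an assumption here, and the device labels (`camera`, `plug`, `bulb`) and function names are hypothetical.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {label: F1} computed from paired true/predicted labels."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1          # correct identification of device t
        else:
            fp[p] += 1          # p was predicted but is wrong
            fn[t] += 1          # t was the true device but was missed
    scores = {}
    for label in set(y_true) | set(y_pred):
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (assumed averaging mode)."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Hypothetical labels: one "camera" sample is misidentified as "plug".
y_true = ["camera", "camera", "plug", "plug", "bulb", "bulb"]
y_pred = ["camera", "plug", "plug", "plug", "bulb", "bulb"]

if __name__ == "__main__":
    print(per_class_f1(y_true, y_pred))
    print(round(macro_f1(y_true, y_pred), 4))
```

Macro averaging weights every device class equally, so a device type that is rarely seen but often misidentified lowers the score as much as a common one would. Other averaging modes would give different numbers for the same predictions.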