Leveraging Machine Learning for Accurate IoT Device Identification in Dynamic Wireless Contexts

Identifying IoT devices is crucial for network monitoring, security enforcement, and inventory tracking. However, most existing identification methods rely on deep packet inspection, which raises privacy concerns and adds computational complexity. More importantly, existing works overlook the impact of wireless channel dynamics on the accuracy of layer-2 features, thereby limiting their effectiveness in real-world scenarios. In this work, we define and use the latency of specific probe-response packet exchanges, referred to as "device latency," as the main feature for device identification. Additionally, we reveal the critical impact of wireless channel dynamics on the accuracy of device identification based on device latency. Specifically, this work introduces "accumulation score" as a novel approach to capturing fine-grained channel dynamics and their impact on device latency when training machine learning models. We implement the proposed methods and measure the accuracy and overhead of device identification in real-world scenarios. The results confirm that by incorporating the accumulation score for balanced data collection and training machine learning algorithms, we achieve an F1 score of over 97% for device identification, even amidst wireless channel dynamics, a significant improvement over the 75% F1 score achieved by disregarding the impact of channel dynamics on data collection and device latency.
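As a rough illustration of the two quantities the abstract relies on, the sketch below summarizes a window of probe-response latencies into a small feature vector and computes a macro-averaged F1 score from predicted device labels. The function names (`latency_features`, `macro_f1`), the chosen summary statistics, and the macro averaging are assumptions made for illustration only. The abstract does not specify the paper's actual feature set, its F1 averaging, or how the accumulation score is defined, so none of those are reproduced here.

```python
import statistics


def latency_features(samples_ms):
    """Summarize one window of probe-response latencies (milliseconds).

    Hypothetical feature set: mean, median, population std. dev., minimum.
    """
    ordered = sorted(samples_ms)
    return (
        statistics.mean(ordered),
        statistics.median(ordered),
        statistics.pstdev(ordered),
        ordered[0],
    )


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores over all observed labels."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Made-up latency window and labels, for demonstration only.
    print(latency_features([2.0, 4.0, 6.0]))
    print(macro_f1(["cam", "cam", "plug", "plug"],
                   ["cam", "plug", "plug", "plug"]))
```

In a real pipeline the feature vectors would be fed to a trained classifier, and the F1 score would be computed on held-out measurements collected under varying channel conditions.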
