RF fingerprinting is emerging as a physical-layer security scheme for identifying illegitimate and/or unauthorized emitters that share the RF spectrum. However, because publicly accessible real-world datasets are scarce, most research relies on synthetic waveforms generated with software-defined radios (SDRs), which are not suited for practical deployment settings. On the other hand, the few datasets that are available cover only chipsets that generate a single kind of waveform. Commercial off-the-shelf (COTS) combo chipsets that support two wireless standards (for example, WiFi and Bluetooth) over a shared dual-band antenna, such as those found in laptops, adapters, wireless chargers, and Raspberry Pis, among others, are becoming ubiquitous in the IoT realm. Hence, to keep pace with the modern IoT environment, there is a pressing need for open real-world datasets that capture emissions from these combo chipsets transmitting heterogeneous communication protocols. To this end, we capture what is, to our knowledge, the first set of emissions from COTS IoT chipsets transmitting WiFi and Bluetooth, collected under two different time frames. The separate time frames are essential for rigorously evaluating the generalization capability of models. To ensure widespread use, each capture within the comprehensive 72 GB dataset is long enough (40 MSamples) to support diverse input tensor lengths and formats. Finally, the dataset also includes emissions at varying signal powers to account for the range of feeble to high signal strengths encountered in real-world settings.
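To illustrate how a long capture can support diverse input tensor lengths and formats, the following is a minimal sketch of slicing a capture into fixed-length windows. It assumes the capture has already been loaded as a one-dimensional complex NumPy array; the function name `slice_capture`, its parameters, and the two-channel I/Q output layout are hypothetical choices for illustration and are not taken from the dataset's documentation.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def slice_capture(iq, window_len, stride=None):
    """Split a 1-D complex capture into real-valued I/Q windows.

    Returns a float32 array shaped (num_windows, 2, window_len), where
    channel 0 holds the in-phase part and channel 1 the quadrature part.
    The layout is an illustrative assumption, not the dataset's format.
    """
    iq = np.asarray(iq)
    if iq.ndim != 1:
        raise ValueError("expected a 1-D array of complex samples")
    if stride is None:
        stride = window_len  # non-overlapping windows by default
    if window_len <= 0 or stride <= 0:
        raise ValueError("window_len and stride must be positive")
    if iq.size < window_len:
        # Capture shorter than one window: return an empty batch.
        return np.empty((0, 2, window_len), dtype=np.float32)

    # View of every length-window_len segment, then keep every stride-th one.
    windows = sliding_window_view(iq, window_len)[::stride]
    # Stack real and imaginary parts as two channels (this copies the data).
    return np.stack([windows.real, windows.imag], axis=1).astype(np.float32)


# Toy stand-in for a capture: sample k has value k + 1j*k.
toy = np.arange(10) + 1j * np.arange(10)
batch = slice_capture(toy, window_len=4)
overlapping = slice_capture(toy, window_len=4, stride=2)
```

Under these assumptions, and taking 40 MSamples to mean 40,000,000 samples, a single capture would yield 39,062 non-overlapping windows of 1,024 samples each; a smaller stride gives more, overlapping windows from the same capture.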