RF fingerprinting is emerging as a physical-layer security scheme for identifying illegitimate and/or unauthorized emitters sharing the RF spectrum. However, because publicly accessible real-world datasets are scarce, most research relies on synthetic waveforms generated with software-defined radios (SDRs), which are not suited to practical deployment settings. On the other hand, the few datasets that are available cover only chipsets that generate a single kind of waveform. Commercial off-the-shelf (COTS) combo chipsets that support two wireless standards (for example, WiFi and Bluetooth) over a shared dual-band antenna, such as those found in laptops, adapters, wireless chargers, and Raspberry Pis, among other devices, are becoming ubiquitous in the IoT realm. Hence, to keep up with the modern IoT environment, there is a pressing need for real-world open datasets that capture emissions from these combo chipsets transmitting heterogeneous communication protocols. To this end, we capture the first known emissions from COTS IoT chipsets transmitting WiFi and Bluetooth under two different time frames. The different time frames are essential for rigorously evaluating the generalization capability of models. To ensure widespread use, each capture within the comprehensive 72 GB dataset is long enough (40 MSamples) to support diverse input tensor lengths and formats. Finally, the dataset also comprises emissions at varying signal powers to account for the range of signal strengths, from feeble to high, encountered in real-world settings.
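
As a rough illustration of how one long capture can serve several input tensor lengths and formats, the sketch below slices a complex baseband capture into fixed-length windows. It is a minimal example under stated assumptions: the function name `slice_capture`, the three format labels, and the use of complex64 samples are hypothetical choices made here, not the dataset's documented layout, and a short synthetic signal stands in for a real 40 MSample capture.

```python
import numpy as np


def slice_capture(iq, window_len, fmt="iq_channels"):
    """Split a 1-D complex capture into non-overlapping windows.

    Hypothetical helper; the format names below are illustrative only.
      "complex"     -> shape (num_windows, window_len), complex dtype
      "iq_channels" -> shape (num_windows, 2, window_len), real I and Q
      "polar"       -> shape (num_windows, 2, window_len), magnitude and phase
    """
    iq = np.asarray(iq)
    num_windows = iq.size // window_len
    # Drop the trailing samples that do not fill a whole window.
    windows = iq[: num_windows * window_len].reshape(num_windows, window_len)

    if fmt == "complex":
        return windows
    if fmt == "iq_channels":
        return np.stack([windows.real, windows.imag], axis=1)
    if fmt == "polar":
        return np.stack([np.abs(windows), np.angle(windows)], axis=1)
    raise ValueError(f"unknown format: {fmt!r}")


# Synthetic stand-in for a capture (assumption: complex64 baseband samples).
rng = np.random.default_rng(0)
capture = (rng.standard_normal(10_000)
           + 1j * rng.standard_normal(10_000)).astype(np.complex64)

short_inputs = slice_capture(capture, window_len=256)
long_inputs = slice_capture(capture, window_len=1024, fmt="polar")
```

The same capture yields many short windows or fewer long ones, so the window length can be chosen per model without re-recording.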