Leveraging Foundation Models for Zero-Shot IoT Sensing

Deep learning models are increasingly deployed on edge Internet of Things (IoT) devices. However, these models are typically trained in a supervised manner and cannot recognize classes that were absent from the training data. Zero-shot learning (ZSL) addresses this by classifying data from unseen classes with the help of semantic information. Foundation models (FMs) trained on web-scale data have shown impressive ZSL capability in natural language processing and visual understanding, but leveraging their generalized knowledge for zero-shot IoT sensing with signals such as mmWave, IMU, and Wi-Fi has not been fully investigated. In this work, we align IoT data embeddings with the semantic embeddings generated by an FM's text encoder for zero-shot IoT sensing. To derive more effective prompts for semantic embedding extraction from the physics principles that govern how IoT sensor signals are generated, we use cross-attention to combine a learnable soft prompt, optimized automatically on training data, with an auxiliary hard prompt that encodes domain knowledge of the IoT sensing task. Because no unseen-class data is available during training, IoT embeddings become biased toward seen classes; to counter this, we use data augmentation to synthesize unseen-class IoT data for fine-tuning the IoT feature extractor and embedding projector. We evaluate our approach on multiple IoT sensing tasks. Results show that it achieves superior open-set detection and generalized zero-shot learning performance compared with various baselines. Our code is available at https://github.com/schrodingho/FM_ZSL_IoT.
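
To make the two central ideas more concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes (a) a single-head cross-attention without learned projection matrices, in which soft-prompt tokens act as queries over hard-prompt tokens, and (b) zero-shot prediction by cosine similarity between a projected IoT embedding and per-class text embeddings. The function names `fuse_prompts` and `zero_shot_predict`, the residual connection, and all array shapes are hypothetical choices for illustration; the abstract does not specify these details.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def fuse_prompts(soft_prompt, hard_prompt):
    """Combine prompts with single-head cross-attention (illustrative assumption).

    soft_prompt: (n_soft, d) learnable token embeddings, used as queries.
    hard_prompt: (n_hard, d) embeddings of a hand-written domain-knowledge
                 prompt, used as keys and values.
    Returns an (n_soft, d) array of fused prompt tokens.
    """
    d = soft_prompt.shape[-1]
    scores = soft_prompt @ hard_prompt.T / np.sqrt(d)  # (n_soft, n_hard)
    attn = softmax(scores, axis=-1)
    # Residual connection (an assumption) keeps the soft-prompt information.
    return soft_prompt + attn @ hard_prompt


def zero_shot_predict(iot_emb, class_text_emb):
    """Assign each IoT embedding to the class with the most similar text embedding.

    iot_emb:        (n_samples, d) projected IoT embeddings.
    class_text_emb: (n_classes, d) text-encoder embeddings, one per class,
                    which may include classes never seen during training.
    Returns (predicted class indices, cosine-similarity matrix).
    """
    sims = l2_normalize(iot_emb) @ l2_normalize(class_text_emb).T
    return sims.argmax(axis=-1), sims


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fused = fuse_prompts(rng.normal(size=(4, 8)), rng.normal(size=(6, 8)))
    print(fused.shape)  # one fused token per soft-prompt token

    class_text_emb = rng.normal(size=(5, 8))  # 5 candidate classes
    iot_emb = rng.normal(size=(3, 8))         # 3 sensor samples
    pred, _ = zero_shot_predict(iot_emb, class_text_emb)
    print(pred)
```

In a trained system, the fused prompt tokens would be passed through the FM's text encoder to produce `class_text_emb`, and `iot_emb` would come from the IoT feature extractor and embedding projector; here random arrays stand in for both.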
