There is an ongoing debate about whether Large Language Models (LLMs) can serve as foundational models that integrate seamlessly with Cyber-Physical Systems (CPS) to interpret the physical world. In this paper, we carry out a case study to answer the following question: Are LLMs capable of zero-shot human activity recognition (HAR)? Our study, HARGPT, gives an affirmative answer by demonstrating that LLMs can comprehend raw inertial measurement unit (IMU) data and perform HAR tasks in a zero-shot manner, given only appropriate prompts. HARGPT feeds raw IMU data into LLMs and uses role-play and think-step-by-step strategies for prompting. We benchmark HARGPT on GPT4 using two public datasets with different inter-class similarities and compare it against baselines based on both traditional machine learning and state-of-the-art deep classification models. Remarkably, LLMs successfully recognize human activities from raw IMU data and consistently outperform all the baselines on both datasets. Our findings indicate that, with effective prompting, LLMs can interpret raw IMU data based on their knowledge base, showing promising potential for analyzing raw sensor data from the physical world.
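To make the prompting idea concrete, the following is a minimal sketch of how raw IMU readings might be combined with a role-play instruction and a think-step-by-step cue. The function name `build_har_prompt`, its parameters, and the prompt wording are all hypothetical illustrations and are not the prompts used in the study; the call that sends the prompt to an LLM is deliberately omitted.

```python
from typing import Sequence, Tuple


def build_har_prompt(
    samples: Sequence[Tuple[float, float, float]],
    sampling_hz: int,
    candidate_activities: Sequence[str],
) -> str:
    """Assemble a zero-shot HAR prompt from raw accelerometer triples.

    The wording below is an assumption made for illustration only.
    """
    # Role-play: frame the model as a domain expert.
    role = "You are an expert in analyzing IMU signals for human activity recognition."

    # Raw data: one "x, y, z" triple per line, with no feature extraction.
    data = "\n".join(f"{x:.3f}, {y:.3f}, {z:.3f}" for x, y, z in samples)

    # Think step-by-step: ask for reasoning before the final label.
    task = (
        f"The following 3-axis accelerometer readings were sampled at {sampling_hz} Hz.\n"
        f"{data}\n"
        f"Which activity is the person performing: {', '.join(candidate_activities)}?\n"
        "Let's think step by step, then state the final answer."
    )
    return f"{role}\n\n{task}"
```

Under these assumptions, the returned string would be passed as a single message to whichever LLM is being evaluated, and the predicted activity would be read from the end of the model's response.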