Large Language Models (LLMs) have demonstrated remarkable capabilities across textual and visual domains, but they often generate outputs that violate physical laws, revealing a gap in their understanding of the physical world. Inspired by human cognition, where perception is fundamental to reasoning, we explore augmenting LLMs with enhanced perception abilities using Internet of Things (IoT) sensor data and pertinent knowledge for IoT task reasoning in the physical world. In this work, we systematically study LLMs' capability to address real-world IoT tasks by augmenting their perception and knowledge base, and then propose a unified framework, IoT-LLM, to enhance this capability. In IoT-LLM, we customize three steps for LLMs: preprocessing IoT data into formats amenable to LLMs, activating their commonsense knowledge through chain-of-thought prompting and specialized role definitions, and expanding their understanding via IoT-oriented retrieval-augmented generation based on in-context learning. To evaluate performance, we design a new benchmark of five real-world IoT tasks with different data types and reasoning difficulties, and provide benchmarking results on six open-source and closed-source LLMs. Experimental results show that existing LLMs given naive textual inputs cannot perform these tasks effectively. We show that IoT-LLM significantly enhances the IoT task reasoning performance of LLMs such as GPT-4, achieving an average improvement of 65% over previous methods across tasks. The results also showcase LLMs' ability to comprehend IoT data and the physical laws behind the data by providing a reasoning process. We discuss the limitations of our work to inspire future research in this emerging area.
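The three steps described above (data preprocessing, role definition with chain-of-thought prompting, and retrieval-augmented generation) can be illustrated with a minimal sketch. This is not the IoT-LLM implementation: the function names `summarize_readings`, `retrieve`, and `build_prompt`, the prompt wording, the statistical summary, and the word-overlap retriever are all assumptions made for illustration, and a real system would likely use an embedding-based retriever and task-specific preprocessing.

```python
from statistics import mean, pstdev


def summarize_readings(readings, unit):
    """Step 1 (assumed form): compress raw sensor samples into a short text summary."""
    return (
        f"{len(readings)} samples, mean={mean(readings):.2f} {unit}, "
        f"std={pstdev(readings):.2f} {unit}, "
        f"min={min(readings):.2f} {unit}, max={max(readings):.2f} {unit}"
    )


def retrieve(query, knowledge, k=1):
    """Step 3 (toy stand-in): rank knowledge snippets by word overlap with the query."""
    query_words = set(query.lower().split())
    ranked = sorted(
        knowledge,
        key=lambda doc: len(query_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(task, readings, unit, knowledge):
    """Assemble one prompt from the role, retrieved context, data summary, and task."""
    data_text = summarize_readings(readings, unit)
    context = retrieve(task, knowledge)
    return "\n".join([
        "You are an expert in IoT sensor data analysis.",  # step 2: role definition
        "Relevant knowledge:",
        *[f"- {snippet}" for snippet in context],
        f"Sensor data: {data_text}",
        f"Task: {task}",
        "Think step by step before giving the final answer.",  # step 2: chain-of-thought cue
    ])


if __name__ == "__main__":
    # Hypothetical knowledge snippets and readings, invented for this example.
    knowledge = [
        "Walking produces periodic accelerometer peaks near 2 Hz.",
        "Indoor temperature rarely changes faster than 1 degree per minute.",
    ]
    print(build_prompt(
        "classify the human activity from accelerometer data",
        [1.0, 2.0, 3.0],
        "g",
        knowledge,
    ))
```

The resulting string would be sent to whichever LLM is being evaluated. Keeping the three steps as separate functions makes it straightforward to ablate each one, which matches the spirit of comparing against naive textual inputs.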