Large Language Models (LLMs) have demonstrated remarkable capabilities across textual and visual domains, but they often generate outputs that violate physical laws, revealing a gap in their understanding of the physical world. Inspired by human cognition, where perception is fundamental to reasoning, we explore augmenting LLMs with enhanced perception abilities, using Internet of Things (IoT) sensor data and pertinent knowledge, for IoT task reasoning in the physical world. In this work, we systematically study LLMs' capability to address real-world IoT tasks by augmenting their perception and knowledge base, and then propose a unified framework, IoT-LLM, to enhance this capability. In IoT-LLM, we customize three steps for LLMs: preprocessing IoT data into formats amenable to LLMs, activating their commonsense knowledge through chain-of-thought prompting and specialized role definitions, and expanding their understanding via IoT-oriented retrieval-augmented generation based on in-context learning. To evaluate performance, we design a new benchmark of five real-world IoT tasks with different data types and reasoning difficulties, and we report benchmarking results on six open-source and closed-source LLMs. Experimental results show that existing LLMs, given naive textual inputs, cannot perform these tasks effectively. We show that IoT-LLM significantly enhances the IoT task reasoning performance of LLMs such as GPT-4, achieving an average improvement of 65% across various tasks against previous methods. The results also showcase LLMs' ability to comprehend IoT data and the physical laws behind the data by providing a reasoning process. We discuss the limitations of our work to inspire future research in this new era.
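The three steps named in the abstract (preprocessing IoT data, role definition with chain-of-thought prompting, and retrieval of in-context examples) can be illustrated with a minimal Python sketch. This is not the IoT-LLM implementation: every name here (`Demo`, `summarize_readings`, `token_overlap`, `retrieve`, `build_prompt`) is hypothetical, the statistical summary is only one assumed way of making sensor data readable as text, the Jaccard token overlap is a stand-in for whatever retriever the framework actually uses, and the sensor values are made up for illustration. The sketch only assembles a prompt and does not call any model.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class Demo:
    """A hypothetical labeled example stored in the retrieval pool."""
    summary: str
    label: str


def summarize_readings(name: str, unit: str, values: list[float]) -> str:
    """Step 1 (assumed form): turn raw readings into a compact text description."""
    return (
        f"{name} ({unit}): n={len(values)}, mean={mean(values):.2f}, "
        f"std={pstdev(values):.2f}, min={min(values):.2f}, max={max(values):.2f}"
    )


def token_overlap(a: str, b: str) -> float:
    """Jaccard similarity over whitespace tokens; a stand-in for a real retriever."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def retrieve(query: str, pool: list[Demo], k: int = 1) -> list[Demo]:
    """Step 3 (assumed form): pick the k demonstrations most similar to the query."""
    ranked = sorted(pool, key=lambda d: token_overlap(query, d.summary), reverse=True)
    return ranked[:k]


def build_prompt(task: str, summary: str, demos: list[Demo]) -> str:
    """Step 2 plus assembly: role definition, retrieved examples, chain-of-thought cue."""
    lines = [
        "You are an expert in IoT sensor data analysis.",  # role definition
        f"Task: {task}",
        "Reference examples:",
    ]
    lines += [f"- {d.summary} -> {d.label}" for d in demos]
    lines += [
        f"New data: {summary}",
        "Let's think step by step, then state the final answer.",  # CoT cue
    ]
    return "\n".join(lines)


# Illustrative, made-up readings; not data from the benchmark.
pool = [
    Demo(summarize_readings("accelerometer", "m/s^2", [9.6, 10.4, 9.1, 10.9]), "walking"),
    Demo(summarize_readings("thermometer", "C", [21.0, 21.1, 21.0, 21.2]), "no activity sensed"),
]
query = summarize_readings("accelerometer", "m/s^2", [9.5, 10.6, 9.0, 11.0])
prompt = build_prompt("Classify the human activity.", query, retrieve(query, pool))
print(prompt)
```

The resulting string would be sent to an LLM of choice. Keeping the three steps as separate functions makes it straightforward to swap in a different textual encoding or a stronger retriever without touching the prompt template.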