Artificial intelligence (AI) agents embedded in environments with physics-based interaction face many challenges, including reasoning, planning, summarization, and question answering. These challenges are exacerbated when a human user wishes to guide or interact with the agent in natural language. Although Language Models (LMs) are the default choice for such interaction, they struggle with tasks involving physics: an LM's capability for physical reasoning is learned from observational data rather than grounded in simulation. A common approach is to include simulation traces as context, but this scales poorly because simulation traces contain large volumes of fine-grained numerical and semantic data. In this paper, we propose a natural language guided method to discover coarse-grained patterns (e.g., 'rigid-body collision' and 'stable support') from detailed simulation logs. Specifically, we synthesize programs that operate on simulation logs and map them to a series of high-level activated patterns. We show, through two physics benchmarks, that this annotated representation of the simulation log is more amenable to natural language reasoning about physical systems. We also demonstrate how this method enables LMs to generate effective reward programs from goals specified in natural language, which may be used within the context of planning or supervised learning.
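To make the idea of mapping a simulation log to activated patterns concrete, the following is a minimal sketch and not the synthesized programs from the paper. The log format (`Frame`), the detector functions, the thresholds `rest_speed` and `min_steps`, and the reward function are all hypothetical and introduced only for illustration. It shows how two small programs could compress a fine-grained trace into a short list of pattern events, and how a reward program for a natural language goal such as "the ball comes to rest on the table" could be written against those events rather than against raw numbers.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    """One step of a hypothetical simulation log (format assumed for illustration)."""
    step: int
    speed: dict      # body name -> speed magnitude
    contacts: set    # set of (body_a, body_b) tuples currently in contact


def detect_collisions(log):
    """Emit 'rigid-body collision' at the step where a contact pair first appears."""
    events, previous = [], set()
    for frame in log:
        for pair in sorted(frame.contacts - previous):
            events.append((frame.step, "rigid-body collision", pair))
        previous = frame.contacts
    return events


def detect_stable_support(log, rest_speed=0.01, min_steps=3):
    """Emit 'stable support' once a pair has stayed in contact, with both
    bodies slower than rest_speed, for min_steps consecutive frames."""
    events, streak = [], {}
    for frame in log:
        for pair in sorted(frame.contacts):
            at_rest = all(frame.speed[body] < rest_speed for body in pair)
            streak[pair] = streak.get(pair, 0) + 1 if at_rest else 0
            if streak[pair] == min_steps:
                events.append((frame.step, "stable support", pair))
        # A pair that is no longer in contact loses its streak.
        for pair in list(streak):
            if pair not in frame.contacts:
                streak[pair] = 0
    return events


def annotate(log):
    """Merge all detector outputs into one time-ordered list of pattern events."""
    return sorted(detect_collisions(log) + detect_stable_support(log))


def reward_ball_rests_on_table(events):
    """Hypothetical reward program for the goal 'the ball comes to rest on the table'."""
    reached = any(
        name == "stable support" and pair == ("ball", "table")
        for _, name, pair in events
    )
    return 1.0 if reached else 0.0


# A toy trace: the ball falls, hits the table at step 2, then settles.
ball_speeds = [2.0, 3.0, 1.0, 0.005, 0.001, 0.0]
log = [
    Frame(
        step=i,
        speed={"ball": s, "table": 0.0},
        contacts={("ball", "table")} if i >= 2 else set(),
    )
    for i, s in enumerate(ball_speeds)
]

events = annotate(log)
for event in events:
    print(event)
print("reward:", reward_ball_rests_on_table(events))
```

In this toy trace, six frames of numerical state reduce to two events, which is the kind of coarse-grained representation that is easier to place in an LM's context than the raw log.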