Can emergent language models faithfully model the intelligence of decision-making agents? Although modern language models already exhibit some reasoning ability and can, in principle, express any probability distribution over tokens, it remains underexplored how the world knowledge these pretrained models have memorized can be leveraged to comprehend an agent's behaviour in the physical world. This study empirically examines, for the first time, how well large language models (LLMs) can build a mental model of agents, termed agent mental modelling, by reasoning about an agent's behaviour and its effect on states from the agent's interaction history. This research may unveil the potential of leveraging LLMs to elucidate RL agent behaviour, addressing a key challenge in eXplainable reinforcement learning (XRL). To this end, we propose specific evaluation metrics and test them on selected RL task datasets of varying complexity, reporting findings on agent mental model establishment. Our results show that LLMs are not yet capable of fully mentally modelling agents through inference alone, without further innovations. This work thus provides new insights into the capabilities and limitations of modern LLMs.