Effectively extracting the world knowledge in LLMs for complex decision-making tasks remains a challenge. We propose PIANIST, a framework that decomposes the world model into seven intuitive components conducive to zero-shot LLM generation. Given only a natural language description of the game and of how input observations are formatted, our method can generate a working world model for fast and efficient MCTS simulation. We show that our method works well on two different games that challenge the agent's planning and decision-making skills in both language-based and non-language-based action taking, without any domain-specific training data or an explicitly defined world model.