Suspicion-Agent: Playing Imperfect Information Games with Theory of Mind Aware GPT-4

Unlike perfect information games, where all elements are known to every player, imperfect information games emulate the real-world complexities of decision-making under uncertain or incomplete information. GPT-4, a recent breakthrough in large language models (LLMs) trained on massive passive data, is notable for its knowledge retrieval and reasoning abilities. This paper investigates the applicability of GPT-4's learned knowledge to imperfect information games. To this end, we introduce \textbf{Suspicion-Agent}, an agent that leverages GPT-4's capabilities to play imperfect information games. With prompt engineering tailored to different functions, Suspicion-Agent based on GPT-4 demonstrates remarkable adaptability across a range of imperfect information card games. Importantly, GPT-4 displays a strong high-order theory of mind (ToM) capacity, meaning it can understand others and intentionally influence their behavior. Leveraging this, we design a planning strategy that enables GPT-4 to play competently against different opponents, adapting its gameplay style as needed, while requiring only the game rules and descriptions of observations as input. In the experiments, we qualitatively showcase the capabilities of Suspicion-Agent across three different imperfect information games and then quantitatively evaluate it in Leduc Hold'em. The results show that Suspicion-Agent can potentially outperform traditional algorithms designed for imperfect information games, without any specialized training or examples. To foster deeper insights within the community, we make our game-related data publicly available.
