Suspicion-Agent: Playing Imperfect Information Games with Theory of Mind Aware GPT-4

Unlike perfect information games, where every player knows all elements of the game, imperfect information games emulate the real-world complexity of decision-making under uncertain or incomplete information. GPT-4, a recent breakthrough in large language models (LLMs) trained on massive passive data, is notable for its knowledge retrieval and reasoning abilities. This paper investigates whether the knowledge GPT-4 has learned can be applied to imperfect information games. To this end, we introduce Suspicion-Agent, an agent that leverages GPT-4's capabilities to play imperfect information games. With prompts engineered for different functions, the GPT-4-based Suspicion-Agent demonstrates remarkable adaptability across a range of imperfect information card games. Importantly, GPT-4 displays a strong high-order theory of mind (ToM) capacity: it can understand others and intentionally influence their behavior. Leveraging this, we design a planning strategy that enables GPT-4 to play competently against different opponents, adapting its gameplay style as needed, while requiring only the game rules and descriptions of observations as input. In the experiments, we qualitatively showcase the capabilities of Suspicion-Agent across three different imperfect information games and then quantitatively evaluate it in Leduc Hold'em. The results show that Suspicion-Agent can potentially outperform traditional algorithms designed for imperfect information games, without any specialized training or examples. To encourage deeper insights within the community, we make our game-related data publicly available.
