In many real-world settings, agents engage in strategic interactions with multiple opposing agents who can employ a wide variety of strategies. The standard approach for designing agents for such settings is to compute or approximate a relevant game-theoretic solution concept such as Nash equilibrium and then follow the prescribed strategy. However, such a strategy ignores any observations of opponents' play, which may indicate shortcomings that can be exploited. We present an approach for opponent modeling in multiplayer imperfect-information games where we collect observations of opponents' play through repeated interactions. We run experiments against a wide variety of real opponents and exact Nash equilibrium strategies in three-player Kuhn poker and show that our algorithm significantly outperforms all of the agents, including the exact Nash equilibrium strategies.