Combining Bayesian games with learning has a rich history. The underlying idea is to control a single agent in a system of multiple agents whose behaviors are unknown, given a set of types, each of which specifies a possible behavior for the other agents. The agent plans its own actions to maximize its payoff with respect to the types it believes are most likely. However, type beliefs are often learned from past actions and are likely to be incorrect. With this in mind, we consider an agent in a game with type predictions of other components and investigate the impact of incorrect beliefs on the agent's payoff. In particular, we formally define a tradeoff between risk and opportunity by comparing the payoff obtained against the optimal payoff, a comparison expressed as the gap caused by trusting or distrusting the learned beliefs. Our main results characterize this tradeoff by establishing upper and lower bounds on the Pareto front for both normal-form and stochastic Bayesian games, supported by numerical results.