A mediator observes no-regret learners playing an extensive-form game repeatedly across $T$ rounds. The mediator attempts to steer the players toward some desirable, predetermined equilibrium by giving them (nonnegative) payments. We call this the steering problem. The steering problem captures several problems of interest, among them equilibrium selection and information design (persuasion). If the mediator's budget is unbounded, steering is trivial because the mediator can simply pay the players to play desirable actions. We study two bounds on the mediator's payments: a total budget and a per-round budget. If the mediator's total budget does not grow with $T$, we show that steering is impossible. However, we show that it is enough for the total budget to grow sublinearly with $T$, that is, for the average payment to vanish. When players' full strategies are observed at each round, we show that constant per-round budgets permit steering. In the more challenging setting where only trajectories through the game tree are observable, we show that steering is impossible with constant per-round budgets in general extensive-form games, but possible in normal-form games or if the per-round budget may itself depend on $T$. We also show how our results can be generalized to the case where the equilibrium is computed online while steering is happening. We supplement our positive theoretical results with experiments highlighting the efficacy of steering in large games.
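To make the setting concrete, the following is a minimal toy sketch and not the algorithm analyzed in this work. It assumes a symmetric 2x2 stag hunt (a normal-form game), two Hedge (multiplicative-weights) learners that receive expected utilities as feedback, and a mediator that observes both mixed strategies each round. The payoff matrix `U`, the learning rate `eta`, the bonus `alpha = T ** -0.25`, and the payment rule are all illustrative assumptions. The mediator pays a player who plays the target action Stag a small bonus plus compensation for the opponent's current deviation from Stag; in this particular game the payment is nonnegative and at most `alpha + 4` per player per round.

```python
import numpy as np

# Assumed toy game: symmetric stag hunt. U[a, b] is a player's payoff for
# playing action a against opponent action b.
# Action 0 = Stag (the target equilibrium), action 1 = Hare (risk-dominant).
U = np.array([[4.0, 0.0],
              [3.0, 3.0]])


def hedge_strategy(cum_utility, eta):
    """Hedge: play a softmax of the cumulative utilities seen so far."""
    z = eta * (cum_utility - cum_utility.max())  # shift for numerical stability
    w = np.exp(z)
    return w / w.sum()


def simulate(T, steer, eta=0.1):
    """Run T rounds; return (P0[Stag], P1[Stag], average payment, max round payment)."""
    # Assumed bonus schedule: vanishes as T grows, so the average payment does too.
    alpha = T ** -0.25 if steer else 0.0
    cum = [np.zeros(2), np.zeros(2)]
    total_paid = 0.0
    max_round_payment = 0.0
    for _ in range(T):
        # Both players choose simultaneously from their current cumulative utilities.
        x = [hedge_strategy(c, eta) for c in cum]
        round_payment = 0.0
        for i in (0, 1):
            opp = x[1 - i]
            u = U @ opp  # expected game utility of each of player i's actions
            if steer:
                # Payment for playing Stag: bonus plus compensation for the
                # opponent not (yet) playing Stag. Nonnegative in this game.
                pay_stag = alpha + (U[0, 0] - u[0])
                u = u + np.array([pay_stag, 0.0])
                round_payment += x[i][0] * pay_stag  # expected payment to player i
            cum[i] = cum[i] + u
        total_paid += round_payment
        max_round_payment = max(max_round_payment, round_payment)
    x_final = [hedge_strategy(c, eta) for c in cum]
    return x_final[0][0], x_final[1][0], total_paid / T, max_round_payment


if __name__ == "__main__":
    for steer in (False, True):
        p0, p1, avg, mx = simulate(2000, steer)
        print(f"steer={steer}: P[Stag]=({p0:.3f}, {p1:.3f}), "
              f"avg payment={avg:.3f}, max round payment={mx:.3f}")
```

In this sketch the unsteered learners, starting from uniform play, drift to the Hare equilibrium, while the steered learners move to Stag. Once both players are close to Stag, the compensation term shrinks and the per-round payment approaches `2 * alpha`, which is the sense in which the average payment can vanish as $T$ grows.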