Stackelberg games (SGs) constitute some of the most fundamental and widely studied models of strategic interactions involving a form of commitment. Moreover, they form the basis of more elaborate models of this kind, such as Bayesian persuasion and principal-agent problems. Addressing learning tasks in SGs and related models is crucial to operationalize them in practice, where model parameters are usually unknown. In this paper, we revisit the sample complexity of learning an optimal strategy to commit to in SGs. We provide a novel algorithm that (i) does not require any of the limiting assumptions made by state-of-the-art approaches and (ii) deals with a trade-off between sample complexity and termination probability that arises when the leader's strategies are represented with finite precision. This trade-off has been completely neglected by existing algorithms and, if not properly managed, may cause them to use exponentially many samples. Our algorithm relies on novel techniques, which also pave the way to addressing learning problems in other models with commitment that are ubiquitous in the real world.