Bayesian persuasion studies how an informed sender should influence the beliefs of rational receivers, who make decisions by Bayesian updating of a common prior. We focus on the online Bayesian persuasion framework, in which the sender repeatedly faces one or more receivers whose types are unknown and adversarially selected. First, we show how to obtain a tight $\tilde O(T^{1/2})$ regret bound when the sender faces a single receiver and has partial feedback, improving over the best previously known bound of $\tilde O(T^{4/5})$. Then, we provide the first no-regret guarantees for the multi-receiver setting under partial feedback. Finally, we show how to design no-regret algorithms with polynomial per-iteration running time by exploiting type reporting, thereby circumventing known intractability results on online Bayesian persuasion. When type reporting is allowed, we provide efficient algorithms guaranteeing an $O(T^{1/2})$ regret upper bound in both the single- and multi-receiver settings.
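The receiver behavior described above (Bayesian updating of a common prior, followed by a decision) can be illustrated with a minimal sketch. This is not the paper's algorithm: the function names `posterior` and `best_response`, the two-state example, and all numbers are hypothetical and chosen only for illustration.

```python
def posterior(prior, scheme, signal):
    """Bayes update of the common prior after observing a signal.

    prior[w]     : prior probability of state w
    scheme[w][s] : probability that the sender emits signal s in state w
    (Both are illustrative inputs, not taken from the paper.)
    """
    joint = [prior[w] * scheme[w][signal] for w in range(len(prior))]
    total = sum(joint)
    if total == 0:
        raise ValueError("signal has zero probability under this prior and scheme")
    return [j / total for j in joint]


def best_response(belief, utility):
    """Index of the action maximizing the receiver's expected utility.

    utility[a][w] : receiver utility for action a in state w
    (one row per receiver type in a multi-type model; a single type here).
    """
    expected = [sum(b * u for b, u in zip(belief, row)) for row in utility]
    return max(range(len(utility)), key=lambda a: expected[a])


# Hypothetical two-state example: state 0 and state 1 are equally likely.
prior = [0.5, 0.5]
# In state 0 the sender mixes between the two signals; in state 1 it always sends signal 1.
scheme = [[0.5, 0.5],
          [0.0, 1.0]]
# The receiver wants its action to match the state.
utility = [[1.0, 0.0],
           [0.0, 1.0]]

belief_after_0 = posterior(prior, scheme, 0)
belief_after_1 = posterior(prior, scheme, 1)
action_after_0 = best_response(belief_after_0, utility)
action_after_1 = best_response(belief_after_1, utility)
```

In the online setting, the sender would not know the receiver's utility (its type) in advance, which is why regret against the best fixed signaling scheme in hindsight is the natural performance measure.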