Persuading a Behavioral Agent: Approximately Best Responding and Learning

The classic Bayesian persuasion model assumes that the receiver is Bayesian and best responds exactly. We study a relaxation of this model in which the receiver may only approximately best respond to the sender's signaling scheme. We show that, under natural assumptions, (1) the sender can find a signaling scheme that guarantees itself an expected utility almost as good as its optimal utility in the classic model, no matter which approximately best-responding strategy the receiver uses; (2) conversely, no signaling scheme gives the sender much more utility than its optimal utility in the classic model, even if the receiver uses the approximately best-responding strategy that is best for the sender. Together, (1) and (2) imply that the receiver's approximately best-responding behavior has little effect on the sender's maximal achievable utility in the Bayesian persuasion problem. The proofs of both results rely on the idea of robustifying a Bayesian persuasion scheme: given a sender's signaling scheme and a receiver's strategy, we can construct another signaling scheme under which the receiver prefers that strategy more strongly than under the original scheme, while the two schemes give the sender similar utilities. As an application of our main result (1), we show that, in a repeated Bayesian persuasion model where the receiver learns to respond to the sender using some learning algorithm, the sender can do almost as well as in the classic model. Interestingly, in contrast to (2), with a learning receiver the sender can sometimes do much better than in the classic model.
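
To make results (1) and (2) concrete, the following minimal sketch works through a toy binary example. Everything in it is an illustrative assumption rather than the paper's construction: the prior of 0.3, the receiver's 0/1 matching utility, the per-signal definition of an eps-approximate best response (any action within eps of the best expected utility at the posterior), and the helper names `posterior_guilty`, `q_for_posterior`, `sender_utility`, and `demo`. The sender wants the action "convict" and always sends signal 'g' in the guilty state; the only design parameter is the probability q of sending 'g' in the innocent state.

```python
PRIOR = 0.3   # assumed prior probability of the "guilty" state
TOL = 1e-12   # numerical tolerance for comparisons at exact indifference


def posterior_guilty(q):
    """Posterior of 'guilty' after signal 'g', when 'g' is always sent in the
    guilty state and sent with probability q in the innocent state."""
    return PRIOR / (PRIOR + (1 - PRIOR) * q)


def q_for_posterior(p):
    """Inverse of posterior_guilty: the q that induces posterior p after 'g'."""
    return PRIOR * (1 - p) / ((1 - PRIOR) * p)


def sender_utility(q, eps, receiver):
    """Sender's expected utility (the probability of conviction).

    The receiver earns 1 for matching the state and 0 otherwise, so after
    'g' it gains p - (1 - p) from convicting instead of acquitting.
    receiver='worst': among eps-best responses, picks the one worst for the
    sender, so it convicts only if convicting is better by more than eps.
    receiver='best': picks the one best for the sender, so it convicts
    whenever convicting is within eps of optimal.
    After the other signal the state is surely innocent, so the receiver
    acquits under either rule (for eps < 1).
    """
    p = posterior_guilty(q)
    gap = p - (1 - p)
    if receiver == "worst":
        convicts = gap > eps + TOL
    else:
        convicts = gap >= -eps - TOL
    return PRIOR + (1 - PRIOR) * q if convicts else 0.0


def demo(eps=0.05, margin=1e-6):
    # Classic optimum: make the exact best responder indifferent at 'g'.
    q_classic = q_for_posterior(0.5)
    # Robustified scheme: push the posterior just past (1 + eps) / 2 so that
    # convicting is the unique eps-best response.
    q_robust = q_for_posterior((1 + eps) / 2 + margin)
    # Most the sender can extract from a favorable eps-best responder.
    q_greedy = q_for_posterior((1 - eps) / 2)
    return {
        "classic": sender_utility(q_classic, 0.0, "best"),
        "classic_scheme_vs_worst": sender_utility(q_classic, eps, "worst"),
        "robust_scheme_vs_worst": sender_utility(q_robust, eps, "worst"),
        "greedy_scheme_vs_best": sender_utility(q_greedy, eps, "best"),
    }


if __name__ == "__main__":
    for name, value in demo().items():
        print(f"{name}: {value:.4f}")
```

In this toy instance with eps = 0.05, the classic optimum is 0.6. The unmodified classic scheme can collapse to 0 against an adversarial approximate best responder, while the slightly more informative scheme still secures about 0.571, in the spirit of (1). Even a favorable approximate best responder yields at most about 0.632, in the spirit of (2).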
