Strategic information disclosure, in its simplest form, considers a game between an information provider (sender), who has access to some private information, and an information receiver who is interested in that information. The receiver takes an action that affects the utilities of both players, while the sender can design the receiver's information (or modify the receiver's beliefs) through signal commitment, which makes the interaction a Stackelberg game. However, obtaining a Stackelberg equilibrium for this game traditionally requires the sender to have access to the receiver's objective. In this work, we consider an online version of information design in which a sender interacts with a receiver of an unknown type who is adversarially chosen at each round. Restricting attention to a Gaussian prior and quadratic costs for the sender and the receiver, we show that $\mathcal{O}(\sqrt{T})$ regret is achievable with full information feedback, where $T$ is the total number of interactions between the sender and the receiver. Further, we propose a novel parametrization that allows the sender to achieve $\mathcal{O}(\sqrt{T})$ regret for a general convex utility function. We then consider the Bayesian Persuasion problem with an additional cost term in the objective function that penalizes more informative signaling policies, and we obtain $\mathcal{O}(\log(T))$ regret for this setting. Finally, we establish a sublinear regret bound for the partial information feedback setting and provide simulations to support our theoretical results.