We consider the cross-silo federated linear contextual bandit (LCB) problem under differential privacy. In this setting, multiple silos or agents interact with their local users and communicate via a central server to collaborate without sacrificing any user's privacy. We identify two issues in the state-of-the-art algorithm of \cite{dubey2020differentially}: (i) failure of the claimed privacy protection and (ii) a noise miscalculation in the regret bound. To resolve these issues, we take a principled two-step approach. First, we design an algorithmic framework consisting of a generic federated LCB algorithm and flexible privacy protocols. Then, leveraging the proposed framework, we study federated LCBs under two different privacy constraints. We first establish privacy and regret guarantees under silo-level local differential privacy, which fix the issues present in the state-of-the-art algorithm. To further improve the regret performance, we next consider the shuffle model of differential privacy, under which we show that our algorithm can achieve nearly ``optimal'' regret without a trusted server. We accomplish this via two different schemes: one relies on a new result on privacy amplification via shuffling for DP mechanisms, and the other integrates a shuffle protocol for vector sum into the tree-based mechanism. Both might be of independent interest. Finally, we support our theoretical results with numerical evaluations over contextual bandit instances generated from both synthetic and real-life data.
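For readers unfamiliar with the tree-based mechanism mentioned above, the following is a minimal sketch of the standard binary-tree construction for releasing noisy running sums. It is an illustration under stated assumptions, not the algorithm of this paper: the function name `tree_prefix_sums` is hypothetical, the stream is scalar rather than the vector or matrix statistics a bandit algorithm would aggregate, and the noise scale `sigma` is a free parameter that is not calibrated to any privacy budget here.

```python
import random


def tree_prefix_sums(values, sigma, rng=None):
    """Release noisy prefix sums of `values` with the binary-tree mechanism.

    Each dyadic block of the stream gets one independent Gaussian noise draw,
    so every released prefix sum combines at most about log2(n) + 1 noisy
    partial sums. `sigma` is an assumed, uncalibrated noise scale.
    """
    rng = rng or random.Random()
    n = len(values)
    levels = max(1, n.bit_length())
    exact = [0.0] * levels   # exact partial sum held at each tree level
    noisy = [0.0] * levels   # noisy counterpart that is actually released
    released = []
    for t in range(1, n + 1):
        # Level of the node that closes at time t: index of the lowest set bit.
        i = (t & -t).bit_length() - 1
        # Merge all lower-level partial sums plus the new value into level i.
        exact[i] = sum(exact[:i]) + values[t - 1]
        for j in range(i):
            exact[j] = 0.0
            noisy[j] = 0.0
        noisy[i] = exact[i] + rng.gauss(0.0, sigma)
        # The prefix sum at time t uses the levels given by the set bits of t.
        released.append(sum(noisy[j] for j in range(levels) if (t >> j) & 1))
    return released


if __name__ == "__main__":
    stream = [1, 2, 3, 4, 5]
    print(tree_prefix_sums(stream, sigma=0.0))                      # no noise
    print(tree_prefix_sums(stream, sigma=1.0, rng=random.Random(0)))
```

With `sigma=0.0` the function reduces to exact prefix sums, which is a convenient sanity check. The design point is that each stream element contributes to only a logarithmic number of noisy partial sums, so the total noise in any released sum grows with the logarithm of the stream length rather than with the length itself.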