We consider a federated data analytics problem in which a server coordinates the collaborative data analysis of multiple users with privacy concerns and limited communication capability. The commonly adopted compression schemes introduce information loss into local data while improving communication efficiency, and it remains an open problem whether such discrete-valued mechanisms provide any privacy protection. In this paper, we study the local differential privacy guarantees of discrete-valued mechanisms with finite output space through the lens of $f$-differential privacy (DP). More specifically, we advance the existing literature by deriving, in closed-form expressions, tight $f$-DP guarantees for a variety of discrete-valued mechanisms, including the binomial noise and the binomial mechanisms proposed for privacy preservation, and the sign-based methods proposed for data compression. We further investigate the privacy amplification effect of sparsification and propose a ternary stochastic compressor. By leveraging compression for privacy amplification, we improve upon existing methods by removing the dependency of accuracy (in terms of mean square error) on communication cost in the popular use case of distributed mean estimation, thereby breaking the three-way tradeoff between privacy, communication, and accuracy. Finally, we discuss the Byzantine resilience of the proposed mechanism and its application in federated learning.
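To make the notion of a ternary stochastic compressor concrete, the following is a minimal illustrative sketch; the symbols $C$ and $A$ and this particular probability assignment are assumptions introduced here for exposition, not necessarily the exact construction analyzed in the paper. An unbiased ternary compressor maps each coordinate $x \in [-C, C]$ to the three-point alphabet $\{-A, 0, A\}$ with $A \ge C$ according to
\[
\Pr[\mathcal{M}(x) = A] = \frac{C + x}{2A}, \qquad
\Pr[\mathcal{M}(x) = -A] = \frac{C - x}{2A}, \qquad
\Pr[\mathcal{M}(x) = 0] = 1 - \frac{C}{A},
\]
so that $\mathbb{E}[\mathcal{M}(x)] = A\cdot\frac{C+x}{2A} - A\cdot\frac{C-x}{2A} = x$. In a distributed mean estimation setting, averaging the compressed reports across users therefore yields an unbiased estimate of the mean, each coordinate is encoded with at most $\log_2 3$ bits, and increasing $A$ sparsifies the output, which is the sparsification effect whose privacy amplification the abstract refers to.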