We consider the problem of collaborative personalized mean estimation under a privacy constraint in an environment of several agents continuously receiving data according to arbitrary unknown agent-specific distributions. In particular, we provide a method based on hypothesis testing coupled with differential privacy and data variance estimation. Two privacy mechanisms and two data variance estimation schemes are proposed, and we provide a theoretical convergence analysis of the proposed algorithm for any bounded unknown distributions on the agents' data, showing that collaboration yields faster convergence than a fully local approach in which agents do not share data. Moreover, we provide analytical performance curves for the case with an oracle class estimator, i.e., when the class structure of the agents is known, where agents receiving data from distributions with the same mean are considered to belong to the same class. The theoretical faster-than-local convergence guarantee is backed up by extensive numerical results showing that, in the considered scenario, the proposed approach indeed converges much faster than a fully local approach and performs comparably to the ideal setting in which all data is public. This illustrates the benefit of private collaboration in an online setting.
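The abstract's core pipeline (privatize local means, test which peers look like they share the same mean, then pool) can be sketched as follows. This is a minimal illustration, not the paper's actual algorithm: the Laplace mechanism, the standard-error-based test, the `threshold` parameter, and all function names here are assumptions chosen for the sketch, and the paper's two privacy mechanisms and two variance estimation schemes are not reproduced.

```python
import numpy as np

rng = np.random.default_rng(0)


def privatize(local_mean, n, eps, data_range=1.0):
    """Laplace mechanism (one assumed choice of privacy mechanism).

    The mean of n values bounded in [0, data_range] has
    sensitivity data_range / n, so Laplace noise with scale
    data_range / (n * eps) gives eps-differential privacy.
    """
    scale = data_range / (n * eps)
    return local_mean + rng.laplace(0.0, scale)


def same_mean_test(m_i, m_j, var_i, n_i, threshold=3.0):
    """Crude hypothesis test (hypothetical): accept peer j as being in
    the same class if the gap between the (noisy) means is small
    relative to the standard error of agent i's own estimate."""
    se = np.sqrt(var_i / n_i)
    return abs(m_i - m_j) <= threshold * se


def collaborative_estimate(own_data, noisy_peer_means, peer_counts, eps):
    """Pool the local mean with the privatized means of peers that
    pass the same-mean test, weighting by sample counts."""
    n = len(own_data)
    m_i = own_data.mean()
    var_i = own_data.var(ddof=1)  # local data variance estimate
    num, den = n * m_i, n
    for m_j, n_j in zip(noisy_peer_means, peer_counts):
        if same_mean_test(m_i, m_j, var_i, n):
            num += n_j * m_j
            den += n_j
    return num / den
```

In an online setting, an agent would rerun this pooling step as new samples arrive, so the variance estimate and the test sharpen over time; peers wrongly included early on are eventually rejected, which is the intuition behind the faster-than-local convergence claim.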