We consider the problem of collaborative personalized mean estimation under a privacy constraint in an environment of several agents that continuously receive data according to arbitrary unknown agent-specific distributions. In particular, we propose a method based on hypothesis testing coupled with differential privacy and data variance estimation. Two privacy mechanisms and two data variance estimation schemes are proposed, and we provide a theoretical convergence analysis of the proposed algorithm for any bounded unknown distributions on the agents' data, showing that collaboration yields faster convergence than a fully local approach in which agents do not share data. Moreover, we provide analytical performance curves for the case with an oracle class estimator, i.e., when the class structure of the agents (agents receiving data from distributions with the same mean are considered to be in the same class) is known. The theoretical faster-than-local convergence guarantee is supported by extensive numerical results showing that, in the considered scenario, the proposed approach indeed converges much faster than a fully local approach and performs comparably to the ideal setting where all data is public. This illustrates the benefit of private collaboration in an online setting.