Differentially Private Distributed Estimation and Learning

From arXiv. Additional experiments, comparison with related work, and extensions (dynamic networks, directed networks, heterogeneous privacy budgets)

We study distributed estimation and learning problems in a networked environment in which agents exchange information to estimate unknown statistical properties of random variables from their privately observed samples. The agents can collectively estimate the unknown quantities by exchanging information about their private observations, but doing so exposes them to privacy risks. Our novel algorithms extend the existing distributed estimation literature and enable the participating agents to estimate a complete sufficient statistic from private signals acquired offline or online over time, while preserving the privacy of their signals and network neighborhoods. This is achieved through linear aggregation schemes paired with randomization schemes that add noise to the exchanged estimates subject to differential privacy (DP) constraints, in both the offline and online settings. We provide a convergence rate analysis and tight finite-time convergence bounds. We show that the noise that minimizes the convergence time to the best estimates is Laplace noise, with parameters corresponding to each agent's sensitivity to their signal and network characteristics. Our algorithms are further amenable to dynamic topologies and to balancing privacy and accuracy trade-offs. Finally, to supplement and validate our theoretical results, we run experiments on real-world data from the US Power Grid Network and electricity consumption data from German households to estimate the average power consumption of power stations and households under all privacy regimes, and show that our method outperforms existing first-order privacy-aware distributed optimization methods.
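To make the idea of linear aggregation with Laplace noise concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a one-shot (offline) setting in which each agent perturbs its own signal once with Laplace noise of scale sensitivity/epsilon and then repeatedly averages with its neighbors. The ring topology, the equal 1/3 weights, and the names `laplace_noise`, `ring_weights`, and `private_average` are illustrative assumptions that do not come from the abstract.

```python
import random


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample (scale > 0).

    The difference of two i.i.d. exponentials with mean `scale`
    is Laplace-distributed with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def ring_weights(n):
    """Doubly stochastic weights on a ring (assumed topology, n >= 3):
    each agent puts weight 1/3 on itself and on each of its two neighbors."""
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in (i - 1, i, i + 1):
            W[i][j % n] += 1.0 / 3.0
    return W


def private_average(signals, epsilon, sensitivity, rounds, seed=0):
    """Noisy consensus sketch: perturb once, then average linearly.

    Assumption: each agent adds Laplace(sensitivity / epsilon) noise to its
    own signal before sharing anything, so only noisy values are exchanged.
    """
    rng = random.Random(seed)
    n = len(signals)
    W = ring_weights(n)
    scale = sensitivity / epsilon
    # Local randomization step: the raw signals never leave the agents.
    x = [s + laplace_noise(scale, rng) for s in signals]
    # Linear aggregation: x <- W x. Because W is doubly stochastic, the
    # network-wide mean of the noisy values is preserved in every round.
    for _ in range(rounds):
        x = [sum(W[i][j] * x[j] for j in range(n)) for i in range(n)]
    return x


if __name__ == "__main__":
    signals = [0.1, 0.4, 0.35, 0.8, 0.55, 0.2, 0.9, 0.6, 0.3, 0.7]
    estimates = private_average(signals, epsilon=1.0, sensitivity=1.0, rounds=200)
    print("true mean:", sum(signals) / len(signals))
    print("agent 0 estimate:", estimates[0])
```

In this sketch all agents end up agreeing on the mean of the noisy signals, so the estimation error comes entirely from the injected noise: a smaller `epsilon` gives stronger privacy and a larger error. The abstract's claims about which noise distribution is optimal and about convergence rates are not reproduced here.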
