Differentially Private Distributed Estimation and Learning

We study distributed estimation and learning in a networked environment where agents exchange information to estimate unknown statistical properties of random variables from their privately observed samples. Exchanging information about private observations lets the agents collectively estimate the unknown quantities, but it also exposes them to privacy risks. Our aggregation schemes aim to combine the observed data efficiently over time and across the network while accommodating the agents' privacy needs and requiring no coordination beyond their local neighborhoods. Our algorithms enable the participating agents to estimate a complete sufficient statistic from private signals acquired offline or online over time, while preserving the privacy of their signals and network neighborhoods. This is achieved through linear aggregation schemes with adjusted randomization schemes that add noise to the exchanged estimates subject to differential privacy (DP) constraints. In every case, we demonstrate the efficiency of our algorithms by proving convergence to the estimators of a hypothetical, omniscient observer with central access to all of the signals. We also provide convergence rate analysis and finite-time performance guarantees, and show that the noise distribution minimizing the convergence time to the best estimates is the Laplace distribution, with parameters corresponding to each agent's sensitivity to their signal and network characteristics. Finally, to supplement and validate our theoretical results, we run experiments on real-world data from the US Power Grid Network and electricity consumption data from German households to estimate the average power consumption of power stations and households under all privacy regimes.
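
To make the idea of noisy linear aggregation concrete, here is a minimal sketch. It is not the algorithm analyzed in this work, whose update rules are not given in the abstract. It assumes a ring network, bounded scalar signals with a known sensitivity, and a single Laplace perturbation of each signal before any exchange. The names `laplace_noise`, `ring_weights`, and `private_consensus` are hypothetical and introduced only for illustration.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is a Laplace(0, scale) random variable.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def ring_weights(n):
    # Assumed topology: a ring in which each agent averages itself and
    # its two neighbors. The resulting matrix is doubly stochastic.
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in (i - 1, i, i + 1):
            W[i][j % n] += 1.0 / 3.0
    return W


def private_consensus(signals, epsilon, sensitivity, rounds, seed=0):
    rng = random.Random(seed)
    n = len(signals)
    W = ring_weights(n)
    # Laplace mechanism: scale = sensitivity / epsilon.
    scale = sensitivity / epsilon
    # Each agent perturbs its own signal once, before sharing anything.
    x = [s + laplace_noise(scale, rng) for s in signals]
    noisy_mean = sum(x) / n
    # Linear aggregation: repeated local averaging with neighbors only.
    for _ in range(rounds):
        x = [sum(W[i][j] * x[j] for j in range(n)) for i in range(n)]
    return x, noisy_mean


if __name__ == "__main__":
    estimates, noisy_mean = private_consensus(
        [1.0, 2.0, 3.0, 4.0, 5.0], epsilon=1.0, sensitivity=1.0, rounds=200
    )
    print(estimates, noisy_mean)
```

In this sketch every agent's estimate converges to the average of the perturbed signals, so the gap to the omniscient observer's average comes entirely from the injected noise and shrinks as `epsilon` grows.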
