We study distributed estimation and learning in a networked environment where agents exchange information to estimate unknown statistical properties of random variables from their privately observed samples. Exchanging information about private observations lets the agents estimate the unknown quantities collectively, but it also exposes them to privacy risks. Our aggregation schemes combine the observed data efficiently over time and across the network, while accommodating the agents' privacy needs and requiring no coordination beyond their local neighborhoods. Our algorithms enable the participating agents to estimate a complete sufficient statistic from private signals acquired either offline or online over time, while preserving the privacy of their signals and network neighborhoods. This is achieved through linear aggregation schemes paired with adjusted randomization schemes that add noise to the exchanged estimates subject to differential privacy (DP) constraints. In every case, we demonstrate the efficiency of our algorithms by proving convergence to the estimators of a hypothetical, omniscient observer with central access to all of the signals. We also provide convergence rate analysis and finite-time performance guarantees, and we show that the noise that minimizes the convergence time to the best estimates is Laplace noise, with parameters corresponding to each agent's sensitivity to their signal and network characteristics. Finally, to supplement and validate our theoretical results, we run experiments on real-world data from the US Power Grid Network and electricity consumption data from German households to estimate the average power consumption of power stations and households under all privacy regimes.
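To make the idea of linear aggregation with Laplace noise concrete, the following is a minimal sketch and not the algorithm analyzed in this work. It assumes each agent holds one scalar signal, perturbs it once with Laplace noise of scale `sensitivity / epsilon`, and then repeatedly averages with its neighbors using Metropolis weights. The function names, the ring network, the weight choice, and the parameters `epsilon` and `sensitivity` are all illustrative assumptions.

```python
import numpy as np

def metropolis_weights(adj):
    """Build doubly stochastic averaging weights from a symmetric 0/1 adjacency matrix."""
    n = adj.shape[0]
    deg = adj.sum(axis=1)
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j and adj[i, j]:
                W[i, j] = 1.0 / (1.0 + max(deg[i], deg[j]))
        # The remaining mass stays on the agent's own estimate.
        W[i, i] = 1.0 - W[i].sum()
    return W

def private_average(signals, adj, epsilon, sensitivity, rounds=200, seed=0):
    """Hypothetical helper: noisy one-shot release followed by neighbor averaging."""
    rng = np.random.default_rng(seed)
    # Assumption: one Laplace draw per agent, scale = sensitivity / epsilon.
    noisy = signals + rng.laplace(scale=sensitivity / epsilon, size=signals.shape)
    W = metropolis_weights(adj)
    est = noisy.copy()
    for _ in range(rounds):
        # Each agent replaces its estimate with a weighted mean over its neighborhood.
        est = W @ est
    return est, noisy

# Illustrative ring network of six agents (not data from the experiments).
n = 6
adj = np.zeros((n, n), dtype=int)
for i in range(n):
    adj[i, (i + 1) % n] = 1
    adj[(i + 1) % n, i] = 1

signals = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
estimates, noisy = private_average(signals, adj, epsilon=1.0, sensitivity=1.0)
print("per-agent estimates:", estimates)
print("mean of noisy signals:", noisy.mean())
```

In this sketch every agent ends up at the mean of the noisy signals, which is what a central observer would compute from the same perturbed data. A smaller `epsilon` gives stronger privacy but moves that common value further from the true mean.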