This paper investigates the utility gain of using Iterative Bayesian Update (IBU) for private discrete distribution estimation from data obfuscated with Locally Differentially Private (LDP) mechanisms. We compare the performance of IBU to Matrix Inversion (MI), a standard estimation technique, for seven LDP mechanisms designed for one-time data collection and seven others designed for multiple data collections (e.g., RAPPOR). To broaden the scope of our study, we also vary the utility metric, the number of users $n$, the domain size $k$, and the privacy parameter $\epsilon$, using both synthetic and real-world data. Our results suggest that IBU can be a useful post-processing tool for improving the utility of LDP mechanisms in different scenarios without any additional privacy cost. For instance, our experiments show that IBU can provide better utility than MI, especially in high privacy regimes (i.e., when $\epsilon$ is small). Our paper provides insights for practitioners on using IBU in conjunction with existing LDP mechanisms for more accurate and privacy-preserving data analysis. Finally, we implement IBU for all fourteen LDP mechanisms in the state-of-the-art multi-freq-ldpy Python package (https://pypi.org/project/multi-freq-ldpy/) and release all code used for the experiments as open-source tutorials.
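To make the comparison between MI and IBU concrete, the following is a minimal sketch of both estimators applied to reports from a single mechanism. It assumes Generalized Randomized Response (GRR) as the LDP mechanism, which is an illustrative choice rather than one confirmed by the text above, and it does not use the multi-freq-ldpy API: all function names, the toy distribution, and the parameter values are hypothetical.

```python
import math
import random


def grr_params(k, epsilon):
    """Return (p, q): probability of reporting the true value / any other given value."""
    e = math.exp(epsilon)
    return e / (e + k - 1), 1.0 / (e + k - 1)


def grr_perturb(value, k, p, rng):
    """Report the true value with probability p, else a uniform different value."""
    if rng.random() < p:
        return value
    other = rng.randrange(k - 1)
    return other if other < value else other + 1


def mi_estimate(obs_freq, p, q):
    """Matrix inversion in closed form for GRR; entries may be negative."""
    return [(f - q) / (p - q) for f in obs_freq]


def ibu_estimate(obs_freq, p, q, n_iter=1000, tol=1e-12):
    """Iterative Bayesian Update (an EM iteration) starting from the uniform distribution."""
    k = len(obs_freq)
    theta = [1.0 / k] * k
    for _ in range(n_iter):
        # Probability of observing each report y under the current estimate.
        marg = [q + (p - q) * theta[y] for y in range(k)]
        new = [
            theta[x] * sum(obs_freq[y] * (p if y == x else q) / marg[y] for y in range(k))
            for x in range(k)
        ]
        converged = max(abs(a - b) for a, b in zip(new, theta)) < tol
        theta = new
        if converged:
            break
    return theta


# Toy setup (assumed values): k = 5 values, n = 2000 users, small epsilon.
rng = random.Random(0)
k, n, epsilon = 5, 2000, 0.5
true_dist = [0.5, 0.2, 0.15, 0.1, 0.05]
p, q = grr_params(k, epsilon)

values = rng.choices(range(k), weights=true_dist, k=n)
reports = [grr_perturb(v, k, p, rng) for v in values]
obs_freq = [reports.count(y) / n for y in range(k)]

est_mi = mi_estimate(obs_freq, p, q)
est_ibu = ibu_estimate(obs_freq, p, q)
print("MI :", [round(v, 3) for v in est_mi])
print("IBU:", [round(v, 3) for v in est_ibu])
```

Both estimators consume only the obfuscated reports, so neither adds privacy cost. The structural difference visible in this sketch is that MI can return negative frequencies, whereas each IBU iterate remains a valid probability distribution.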