The development of Internet technology enables analysis of whole populations rather than limited samples, and it increases the demand for privacy protection. Local differential privacy (LDP) is an effective standard for measuring privacy; however, the large variance of its mean estimates poses challenges in applications. To address this problem, this paper presents a new LDP approach, the improved Christofides mechanism. We compare four statistical survey methods for sensitive topics: the modified Warner, Simmons, and Christofides mechanisms and the improved Christofides mechanism. Specifically, the Warner, Simmons, and Christofides mechanisms are modified to draw the sample from the population without replacement, which decreases variance. Furthermore, by also drawing cards without replacement in the modified Christofides mechanism, we obtain a new mechanism, the improved Christofides mechanism, which has the smallest variance of the four under a certain assumption when LDP is used as the measure of privacy leakage. This assumption is usually satisfied in the real world. In our experiment on HCOVANY, a real-world dataset from IPUMS USA, the improved mechanism reduces the variance to 28.7% of that of the modified Christofides mechanism. That is, our method yields a more accurate estimate when LDP is used as the measure of privacy leakage. This is the first time the improved Christofides mechanism has been proposed for the LDP framework, based on a comparative analysis of four mechanisms that uses LDP as the common measure of privacy leakage.
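For readers unfamiliar with the randomized-response surveys named in the abstract, the following is a minimal sketch of the classic Warner mechanism only, with sampling with replacement. It is not the modified or improved mechanism proposed in the paper, whose formulas are not given in this abstract. The function names and the parameter `p` (the probability of reporting the true answer) are illustrative assumptions. The sketch shows how the LDP level and the estimator variance both depend on `p`, which is the trade-off the abstract refers to.

```python
import math
import random


def warner_response(has_trait: bool, p: float, rng: random.Random) -> bool:
    """Report the true answer with probability p, otherwise its negation."""
    return has_trait if rng.random() < p else not has_trait


def warner_epsilon(p: float) -> float:
    """LDP level of the classic Warner mechanism: |ln(p / (1 - p))|."""
    return abs(math.log(p / (1.0 - p)))


def warner_estimate(responses: list, p: float) -> float:
    """Unbiased estimate of the trait proportion from randomized answers.

    P(yes) = p * pi + (1 - p) * (1 - pi), solved for pi. Requires p != 0.5.
    """
    yes_rate = sum(responses) / len(responses)
    return (yes_rate - (1.0 - p)) / (2.0 * p - 1.0)


def warner_variance(pi: float, p: float, n: int) -> float:
    """Variance of the estimate for n respondents sampled WITH replacement."""
    return pi * (1.0 - pi) / n + p * (1.0 - p) / (n * (2.0 * p - 1.0) ** 2)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    true_pi, p, n = 0.3, 0.75, 20000  # hypothetical values for illustration
    answers = [warner_response(rng.random() < true_pi, p, rng) for _ in range(n)]
    print("epsilon :", warner_epsilon(p))
    print("estimate:", warner_estimate(answers, p))
    print("variance:", warner_variance(true_pi, p, n))
```

Moving `p` toward 0.5 lowers the privacy leakage (smaller epsilon) but inflates the second variance term, which is why variance reduction at a fixed LDP level is the quantity the paper compares across mechanisms.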