Differential privacy (DP) is a mathematical privacy notion increasingly deployed across government and industry. With DP, privacy protections are probabilistic: they are bounded by the privacy budget parameter, $\epsilon$. Prior work in health and computational science finds that people struggle to reason about probabilistic risks. Yet, communicating the implications of $\epsilon$ to people contributing their data is vital to avoiding privacy theater -- presenting meaningless privacy protection as meaningful -- and empowering more informed data-sharing decisions. Drawing on best practices in risk communication and usability, we develop three methods to convey probabilistic DP guarantees to end users: two that communicate odds and one offering concrete examples of DP outputs. We quantitatively evaluate these explanation methods in a vignette survey study ($n=963$) via three metrics: objective risk comprehension, subjective privacy understanding of DP guarantees, and self-efficacy. We find that odds-based explanation methods are more effective than (1) output-based methods and (2) state-of-the-art approaches that gloss over information about $\epsilon$. Further, when offered information about $\epsilon$, respondents are more willing to share their data than when presented with a state-of-the-art DP explanation. This willingness to share is sensitive to the value of $\epsilon$: as privacy protections weaken, respondents are less likely to share data.