The use of formal privacy to protect the confidentiality of responses in the 2020 Decennial Census of Population and Housing has triggered renewed interest and debate over how to measure the disclosure risks and societal benefits of the published data products. Following long-established precedent in economics and statistics, we argue that any proposal for quantifying disclosure risk should be based on pre-specified, objective criteria. Such criteria should then be used to compare methodologies and identify those with the most desirable properties. We illustrate this approach, using simple desiderata, to evaluate the absolute disclosure risk framework, the counterfactual framework underlying differential privacy, and prior-to-posterior comparisons. We conclude that satisfying all the desiderata is impossible, but counterfactual comparisons satisfy the most, while absolute disclosure risk satisfies the fewest. Furthermore, we explain that many of the criticisms leveled against differential privacy would apply equally to any technology that is not equivalent to direct, unrestricted access to confidential data. Thus, more research is needed, but in the near term, the counterfactual approach appears best suited for privacy-utility analysis.