Disclosure avoidance (DA) systems safeguard the confidentiality of data while allowing it to be analyzed and disseminated. DA methods such as cell suppression, swapping, and k-anonymity are widely applied, and their use may have significant societal and economic implications. However, a formal analysis of their privacy and bias guarantees has been lacking. This paper presents a framework that addresses this gap: it proposes differentially private versions of these mechanisms and derives their privacy bounds. In addition, the paper compares their performance with that of traditional differential privacy mechanisms, in terms of accuracy and fairness, on US Census data release and classification tasks. The results show that, contrary to popular belief, traditional differential privacy techniques may be superior in terms of accuracy and fairness to the differentially private counterparts of widely used DA mechanisms.