Inferential Privacy: From Impossibility to Database Privacy

We investigate whether inferential privacy can be guaranteed for mechanisms that release useful information about data $X$ containing sensitive information. We describe a general model of utility and privacy in which utility is achieved by disclosing the values of low-entropy features of $X$, while privacy is maintained by keeping high-entropy features of $X$ secret. Adopting this model, we prove that meaningful inferential privacy guarantees can be obtained, even though this is commonly considered impossible in light of the well-known result of Dwork and Naor. We then discuss a specific privacy measure, pointwise maximal leakage (PML), whose guarantees are of the inferential type. We use PML to show that differential privacy admits an inferential formulation: it describes the information leaking about a single entry in a database, assuming that every other entry is known and considering the worst-case distribution on the data. Moreover, we define inferential instance privacy (IIP) as a bound on the (non-conditional) information leaking about a single entry in the database under the worst-case distribution, and show that it is equivalent to free-lunch privacy. Overall, our approach to privacy unifies, formalizes, and explains many existing ideas, e.g., why the informed adversary assumption may lead to underestimating the information leaking about each entry in the database. Furthermore, insights obtained from our results suggest general methods for improving privacy analyses; for example, we argue that smaller privacy parameters can be obtained by excluding low-entropy prior distributions from protection.
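To make the role of the prior concrete, the following is a minimal sketch, not taken from this abstract, that computes PML for a finite mechanism. It assumes the definition commonly used in the PML literature, $\ell(X \to y) = \log \max_{x : P_X(x) > 0} P_{Y \mid X}(y \mid x) / P_Y(y)$, measured in nats. The function names `pml` and `randomized_response`, and the choice of binary randomized response as the mechanism, are illustrative assumptions.

```python
import math


def pml(prior, channel, y):
    """Pointwise maximal leakage (in nats) of outcome y.

    Assumed definition: log of max_x P(y|x) / P(y), with the maximum
    taken over the support of the prior.
    prior[x]      -- P_X(x)
    channel[x][y] -- P_{Y|X}(y | x)
    """
    # Marginal probability of observing y under the given prior.
    p_y = sum(prior[x] * channel[x][y] for x in range(len(prior)))
    # Largest likelihood of y among inputs the prior considers possible.
    worst = max(channel[x][y] for x in range(len(prior)) if prior[x] > 0)
    return math.log(worst / p_y)


def randomized_response(eps):
    """Binary randomized response: report the true bit w.p. e^eps / (1 + e^eps)."""
    keep = math.exp(eps) / (1.0 + math.exp(eps))
    return [[keep, 1.0 - keep], [1.0 - keep, keep]]


if __name__ == "__main__":
    eps = 1.0
    channel = randomized_response(eps)
    for prior in ([0.5, 0.5], [0.9, 0.1], [0.999, 0.001]):
        leak = max(pml(prior, channel, y) for y in range(2))
        print(prior, round(leak, 4))
```

In this sketch the leakage under a uniform prior stays well below `eps`, while it approaches `eps` as the prior concentrates on one value. This is consistent with the abstract's remark that excluding low-entropy priors from protection can yield smaller privacy parameters.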
