Pointwise maximal leakage (PML) is a per-outcome privacy measure based on threat models from quantitative information flow. Privacy guarantees with PML rely on knowledge of the distribution that generated the private data. In this work, we propose a framework for PML privacy assessment and mechanism design using empirical estimates of this data-generating distribution. By extending the PML framework to consider sets of data-generating distributions, we arrive at bounds on the worst-case leakage within a given set. We combine these bounds with large-deviation bounds from the literature to obtain distribution-independent $(\varepsilon,\delta)$-PML guarantees when the data-generating distribution is estimated from available data samples. We provide an optimal binary mechanism, and show that mechanism design under this type of uncertainty about the data-generating distribution reduces to a linearly constrained convex program. Further, we show that optimal mechanisms designed for a distribution estimate can still be used when the true data-generating distribution is unknown. Finally, we apply these tools to leakage assessment of the Laplace mechanism and the Gaussian mechanism for binary private data, and numerically show that the presented approach to mechanism design can yield a significant increase in utility compared to local differential privacy while retaining similar privacy guarantees.
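For reference, and using standard notation from the PML literature rather than symbols introduced in this abstract, the per-outcome leakage underlying these guarantees is commonly defined, for a data-generating distribution $P_X$ and a mechanism $P_{Y|X}$ with induced posterior $P_{X|Y}$, as
\[
  \ell(X \to y) \;=\; \log \max_{x \,:\, P_X(x) > 0} \frac{P_{X|Y}(x \mid y)}{P_X(x)},
\]
and a mechanism is said to satisfy $(\varepsilon,\delta)$-PML when $P_Y\bigl(\{\, y : \ell(X \to y) > \varepsilon \,\}\bigr) \le \delta$. The explicit dependence of $\ell(X \to y)$ on $P_X$ is what motivates the set-valued and empirical treatments of the data-generating distribution described above.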