The Normal Distributions Indistinguishability Spectrum and its Application to Privacy-Preserving Machine Learning

We investigate the privacy of {\em any} algorithm whose output follows a Gaussian distribution. This work is motivated by the prevalence of such algorithms in several useful machine learning (ML) applications, and by the comparatively little research on privacy-preserving learning beyond adding Gaussian noise to the data (such as DP-SGD). {\em What is the differential privacy (DP) guarantee of any algorithm with multivariate Gaussian output?} We answer this research question with a general lemma, which we call the {\em Normal Distributions Indistinguishability Spectrum} (NDIS): a closed-form analytic computation of the hockey-stick divergence $δ$ between an arbitrary pair of multivariate Gaussians, parameterized by the privacy parameter $ε$. To show its practical implications, we prove several properties of our NDIS lemma. These properties form a {\em toolbox} of results that lead to potentially {\em easier} privacy proofs for any Gaussian-output algorithm. As an example application of our toolbox, we prove a tighter parameterization of the privacy of {\em random projection (RP)}, and obtain from it a more noise-frugal DP mechanism. Beyond random projection, NDIS can be used to lift {\em any} Gaussian-output algorithm with a `sensitivity' (which we define) to a Gaussian-output DP mechanism. The mechanism boosts the existing randomness in the algorithm, so that its privacy can be described as the indistinguishability spectrum (IS) between a single pair of Gaussians, which can then be analyzed via NDIS. Lastly, we leverage the connection between NDIS and the CDF of the generalized $χ^2$ distribution (which has efficient empirical estimators) to present a tool for white-box auditing of Gaussian-output algorithms.
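
To make the central quantity concrete, the following is a minimal sketch of the hockey-stick divergence $δ(ε) = \int \max(0, p(x) - e^{ε} q(x))\,dx$ between two Gaussians. It is not the NDIS lemma: it is restricted to the one-dimensional case as a simplifying assumption, it estimates $δ$ by plain Monte Carlo sampling, and it checks the estimate against the standard closed form for the equal-variance case (the classical Gaussian mechanism). All function names are hypothetical and chosen for illustration only.

```python
import math
import random


def std_normal_cdf(z: float) -> float:
    """CDF of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def log_pdf(x: float, mu: float, sigma: float) -> float:
    """Log-density of N(mu, sigma^2) at x."""
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2.0 * math.pi))


def hockey_stick_mc(mu_p, sigma_p, mu_q, sigma_q, eps, n=200_000, seed=0):
    """Monte Carlo estimate of E_{x~P}[max(0, 1 - e^eps * q(x) / p(x))].

    This expectation equals the hockey-stick divergence of P from Q at
    level e^eps. Illustrative 1-D sketch only (an assumption of this example).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.gauss(mu_p, sigma_p)
        # t = eps + log(q(x) / p(x)); the integrand is 1 - e^t when t < 0, else 0.
        t = eps + log_pdf(x, mu_q, sigma_q) - log_pdf(x, mu_p, sigma_p)
        if t < 0.0:
            total += 1.0 - math.exp(t)
    return total / n


def hockey_stick_equal_variance(shift, sigma, eps):
    """Closed form for N(0, sigma^2) versus N(shift, sigma^2), with shift > 0."""
    a = shift / (2.0 * sigma)
    b = eps * sigma / shift
    return std_normal_cdf(a - b) - math.exp(eps) * std_normal_cdf(-a - b)


if __name__ == "__main__":
    eps = 0.5
    print("Monte Carlo :", hockey_stick_mc(0.0, 1.0, 1.0, 1.0, eps))
    print("closed form :", hockey_stick_equal_variance(1.0, 1.0, eps))
    # Unequal variances: no equal-variance shortcut applies, only the estimate.
    print("unequal var :", hockey_stick_mc(0.0, 1.0, 0.5, 1.5, eps))
```

The two equal-variance numbers should agree up to sampling error. For unequal variances or covariances the sampling estimate still applies, but the equal-variance formula does not, which is the gap a general closed form such as NDIS is meant to fill.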
