Posterior sampling, i.e., using the exponential mechanism to sample from the posterior distribution, provides $\varepsilon$-pure differential privacy (DP) guarantees and does not suffer from the potentially unbounded privacy breach introduced by $(\varepsilon,\delta)$-approximate DP. In practice, however, one needs to apply approximate sampling methods such as Markov chain Monte Carlo (MCMC), thus re-introducing the unappealing $\delta$-approximation error into the privacy guarantees. To bridge this gap, we propose the Approximate SAmple Perturbation (ASAP) algorithm, which perturbs an MCMC sample with noise proportional to its Wasserstein-infinity ($W_\infty$) distance from a reference distribution that satisfies pure DP or pure Gaussian DP (i.e., $\delta=0$). We then leverage a Metropolis-Hastings algorithm to generate the sample and prove that the algorithm converges in $W_\infty$ distance. We show that by combining our new techniques with a localization step, we obtain the first nearly linear-time algorithm that achieves the optimal rates in the DP-ERM problem with strongly convex and smooth losses.
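The perturbation step described in the abstract, adding noise whose scale is proportional to a $W_\infty$ bound, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the abstract: the function name `perturb_sample`, the use of per-coordinate Laplace noise, the extra privacy parameter `extra_epsilon`, and the constant `sensitivity_factor`. The sketch shows only the proportionality of the noise to the distance bound and is not claimed to reproduce the ASAP calibration or its privacy guarantee.

```python
import numpy as np


def perturb_sample(sample, w_inf_bound, extra_epsilon,
                   sensitivity_factor=2.0, rng=None):
    """Add noise whose scale is proportional to a W_infinity bound.

    Hypothetical illustration only: the Laplace noise family and the
    constant `sensitivity_factor` are assumptions, not the paper's method.
    """
    if w_inf_bound < 0 or extra_epsilon <= 0:
        raise ValueError("need w_inf_bound >= 0 and extra_epsilon > 0")
    rng = np.random.default_rng() if rng is None else rng
    sample = np.asarray(sample, dtype=float)
    # Noise scale grows linearly with the assumed W_infinity bound and
    # shrinks as the extra privacy budget grows.
    scale = sensitivity_factor * w_inf_bound / extra_epsilon
    noise = rng.laplace(loc=0.0, scale=scale, size=sample.shape)
    return sample + noise


if __name__ == "__main__":
    # Stand-in for an MCMC output; a real sampler would produce this vector.
    approx_sample = np.array([0.3, -1.2, 0.8])
    print(perturb_sample(approx_sample, w_inf_bound=0.05, extra_epsilon=0.5,
                         rng=np.random.default_rng(0)))
```

In this sketch a tighter $W_\infty$ bound (for example, from running the sampler longer) directly reduces the injected noise, which is the trade-off the abstract points to.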