We study the relationship between two desiderata of algorithms in statistical inference and machine learning: differential privacy and robustness to adversarial data corruptions. Their conceptual similarity was first noted by Dwork and Lei (STOC 2009), who observed that private algorithms satisfy robustness and gave a general method for converting robust algorithms into private ones. However, all previously known general methods for transforming robust algorithms into private ones lead to suboptimal error rates. Our work gives the first black-box transformation that converts any adversarially robust algorithm into one that satisfies pure differential privacy. Moreover, we show that for any low-dimensional estimation task, applying our transformation to an optimal robust estimator results in an optimal private estimator. Thus, we conclude that for any low-dimensional task, the optimal error rate for $\varepsilon$-differentially private estimators is essentially the same as the optimal error rate for estimators that are robust to adversarial corruption of $1/\varepsilon$ training samples. We also apply our transformation to obtain new optimal private estimators for several high-dimensional tasks, including Gaussian (sparse) linear regression and PCA. Finally, we present an extension of our transformation that leads to approximately differentially private algorithms whose error does not depend on the range of the output space, which is impossible under pure differential privacy.
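For reference, the two privacy notions named in the abstract have the following standard definitions. The symbols $M$ (a randomized algorithm), $S$ and $S'$ (datasets differing in a single sample), $E$ (a measurable set of outputs), and $\delta$ are notation assumed here for illustration and do not appear in the text above.

$$
\text{pure } \varepsilon\text{-DP:}\quad \Pr[M(S) \in E] \le e^{\varepsilon}\,\Pr[M(S') \in E]
$$

$$
\text{approximate } (\varepsilon,\delta)\text{-DP:}\quad \Pr[M(S) \in E] \le e^{\varepsilon}\,\Pr[M(S') \in E] + \delta
$$

One standard way to see why the $1/\varepsilon$ scale arises: chaining the pure guarantee across $k$ changed samples gives a factor of $e^{k\varepsilon}$ (group privacy), so for $k = 1/\varepsilon$ the output distribution changes by at most the constant factor $e$.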