Machine learning models have excelled in a variety of domains and attracted increasing attention from both the security and the privacy communities. One important yet worrying question is: does training models under a differential privacy (DP) constraint harm their adversarial robustness? While previous works have postulated that privacy comes at the cost of worse robustness, we give the first theoretical analysis to show that DP models can indeed be robust and accurate, and sometimes even more robust than their naturally trained non-private counterparts. We observe three key factors that influence the privacy-robustness-accuracy tradeoff: (1) the hyper-parameters of DP optimizers are critical; (2) pre-training on public data significantly mitigates the drop in accuracy and robustness; (3) the choice of DP optimizer makes a difference. With these factors set properly, we achieve 90\% natural accuracy, 72\% robust accuracy ($9\%$ higher than the non-private model) under $l_2(0.5)$ attack, and 69\% robust accuracy ($16\%$ higher than the non-private model) with a pre-trained SimCLRv2 model under $l_\infty(4/255)$ attack on CIFAR10 with $\epsilon=2$. In fact, we show both theoretically and empirically that DP models are Pareto optimal on the accuracy-robustness tradeoff. Empirically, the robustness of DP models is consistently observed across various datasets and models. We believe our encouraging results are a significant step towards training models that are private as well as robust.
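To make concrete which hyper-parameters a DP optimizer exposes, the following is a minimal sketch of one generic DP-SGD-style update (per-sample gradient clipping followed by Gaussian noise) for logistic regression in NumPy. It is an illustration under stated assumptions, not the optimizer or configuration used in this work: the function name `dp_sgd_step` and the parameters `clip_norm` and `noise_multiplier` are hypothetical names chosen here, and the sketch does no privacy accounting, so it does not by itself yield a particular $\epsilon$.

```python
import numpy as np

def dp_sgd_step(w, X, y, lr, clip_norm, noise_multiplier, rng):
    """One illustrative DP-SGD-style step for logistic regression.

    All names here are assumptions made for this sketch.
    """
    # Per-sample gradients of the logistic loss, shape (batch, dim).
    probs = 1.0 / (1.0 + np.exp(-(X @ w)))
    per_sample_grads = (probs - y)[:, None] * X

    # Clip each per-sample gradient so its L2 norm is at most clip_norm.
    norms = np.linalg.norm(per_sample_grads, axis=1, keepdims=True)
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = per_sample_grads * scale

    # Add Gaussian noise with std noise_multiplier * clip_norm to the
    # summed gradient, then average over the batch.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=w.shape)
    noisy_mean = (clipped.sum(axis=0) + noise) / X.shape[0]

    return w - lr * noisy_mean

# Usage on synthetic data (values are arbitrary, for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))
y = (rng.random(8) > 0.5).astype(float)
w = dp_sgd_step(np.zeros(3), X, y, lr=0.1, clip_norm=0.5,
                noise_multiplier=1.0, rng=rng)
```

In this sketch the clipping norm bounds each example's influence on the update and the noise multiplier sets the noise scale relative to that bound; these, together with the learning rate, are the kind of DP optimizer hyper-parameters referred to in factor (1).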