Differentially Private Stochastic Gradient Descent with gradient clipping (DPSGD-GC) is a powerful tool for training deep learning models on sensitive data, providing both a solid theoretical privacy guarantee and high efficiency. However, using DPSGD-GC to ensure Differential Privacy (DP) degrades model performance because of DP noise injection and gradient clipping. Existing research has extensively analyzed the theoretical convergence of DPSGD-GC and has shown that it converges only when using large clipping thresholds that depend on problem-specific parameters. Unfortunately, these parameters are often unknown in practice, making it hard to choose the optimal clipping threshold. Therefore, in practice, DPSGD-GC suffers from degraded performance due to the constant bias introduced by the clipping. In our work, we propose a new error-feedback (EF) DP algorithm as an alternative to DPSGD-GC, which not only offers a diminishing utility bound without inducing a constant clipping bias but, more importantly, also allows an arbitrary choice of clipping threshold that is independent of the problem. We establish an algorithm-specific DP analysis for our proposed algorithm, providing privacy guarantees based on Rényi DP. Additionally, we demonstrate that under mild conditions, our algorithm can achieve nearly the same utility bound as DPSGD without gradient clipping. Our empirical results on the CIFAR-10/100 and E2E datasets show that the proposed algorithm achieves higher accuracies than DPSGD while maintaining the same level of DP guarantee.
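To make the clipping bias discussed above concrete, the following is a minimal NumPy sketch of a single DPSGD-GC update in its commonly used form (per-sample clipping, Gaussian noise added to the clipped sum, then averaging). It illustrates only the baseline, not the proposed EF algorithm, whose update rule is not given here. The function names, the noise scaling `sigma * c`, and the toy gradients are assumptions made for illustration.

```python
import numpy as np


def clip_per_sample(grads, c):
    """Rescale each row of `grads` so that its L2 norm is at most `c`."""
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    # Rows with norm <= c are left unchanged (scale factor 1).
    scale = np.minimum(1.0, c / np.maximum(norms, 1e-12))
    return grads * scale


def dpsgd_gc_step(x, per_sample_grads, lr, c, sigma, rng):
    """One DPSGD-GC update: clip, sum, add Gaussian noise, average, descend.

    Assumption: noise with standard deviation sigma * c is added to the
    sum of clipped gradients, as in the usual DP-SGD formulation.
    """
    clipped = clip_per_sample(per_sample_grads, c)
    batch_size = per_sample_grads.shape[0]
    noise = rng.normal(0.0, sigma * c, size=x.shape)
    noisy_mean = (clipped.sum(axis=0) + noise) / batch_size
    return x - lr * noisy_mean


# Toy batch (hypothetical): one large gradient and three small opposing ones.
grads = np.array([[10.0, 0.0], [-1.0, 0.0], [-1.0, 0.0], [-1.0, 0.0]])
true_mean = grads.mean(axis=0)                           # [1.75, 0.0]
clipped_mean = clip_per_sample(grads, 1.0).mean(axis=0)  # [-0.5, 0.0]
```

In this toy batch, clipping with threshold 1 flips the sign of the averaged gradient even before any noise is added. The error does not shrink with the step size, which is the kind of constant bias that a small, problem-independent threshold can introduce.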