Privacy in AI has drawn sustained attention from researchers and the general public in recent years. As one way to implement privacy-preserving AI, differentially private learning is a framework that enables AI models to be trained with differential privacy (DP). To achieve DP in the learning process, existing algorithms typically limit the magnitude of gradients with a constant clipping threshold, which must be carefully tuned because of its significant impact on model performance. To address this issue, the recent works NSGD and Auto-S propose using normalization instead of clipping to avoid hyperparameter tuning. However, normalization-based approaches like NSGD and Auto-S rely on a monotonic weight function, which places excessive weight on samples with small gradients and introduces extra deviation into the update. In this paper, we propose a Differentially Private Per-Sample Adaptive Clipping (DP-PSAC) algorithm based on a non-monotonic adaptive weight function, which guarantees privacy without the hyperparameter tuning that constant clipping typically requires, while significantly reducing the deviation between the update and the true batch-averaged gradient. We provide a rigorous theoretical convergence analysis and show that, with a convergence rate of the same order, the proposed algorithm achieves a lower non-vanishing bound, which is maintained over training iterations, compared with NSGD/Auto-S. In addition, through extensive experimental evaluation, we show that DP-PSAC outperforms or matches state-of-the-art methods on multiple mainstream vision and language tasks.
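To make the contrast between constant clipping, normalization, and a non-monotonic weight concrete, the following is a minimal NumPy sketch of per-sample gradient weighting followed by Gaussian noise. It is an illustration under stated assumptions, not the paper's implementation: the function names, the parameters `C`, `r`, and `sigma`, and in particular the specific non-monotonic form `C / (norm + r / (norm + r))` are assumptions, since the abstract does not give the exact weight function.

```python
import numpy as np

def per_sample_weights(norms, C=1.0, r=0.1, mode="clip"):
    """Scale factor applied to each per-sample gradient, given its L2 norm.

    All names and formulas here are illustrative assumptions.
    """
    norms = np.asarray(norms, dtype=float)
    if mode == "clip":
        # Constant clipping: gradients with norm above C are scaled down to C.
        return np.minimum(1.0, C / np.maximum(norms, 1e-12))
    if mode == "normalize":
        # Normalization-style weight: monotonically decreasing in the norm,
        # so very small gradients receive weights close to C / r.
        return C / (norms + r)
    if mode == "non_monotonic":
        # Assumed non-monotonic form: the weight stays bounded as norm -> 0.
        return C / (norms + r / (norms + r))
    raise ValueError(f"unknown mode: {mode}")

def private_update(grads, C=1.0, r=0.1, sigma=1.0, mode="clip", rng=None):
    """Weight per-sample gradients, sum, add Gaussian noise, and average."""
    rng = np.random.default_rng() if rng is None else rng
    grads = np.asarray(grads, dtype=float)      # shape: (batch, dim)
    norms = np.linalg.norm(grads, axis=1)       # per-sample L2 norms
    w = per_sample_weights(norms, C, r, mode)
    summed = (w[:, None] * grads).sum(axis=0)   # each term has norm <= C
    noise = rng.normal(0.0, sigma * C, size=summed.shape)
    return (summed + noise) / grads.shape[0]
```

In all three modes the weighted gradient has norm at most `C`, which is what bounds the sensitivity of the sum. The difference is in how small gradients are treated: with `r = 0.1`, the normalization weight approaches `C / r = 10 * C` as the norm goes to zero, while the assumed non-monotonic weight approaches `C`.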