Differentially private stochastic gradient descent (DP-SGD) refers to a family of optimization algorithms that provide a guaranteed level of differential privacy (DP) through DP accounting techniques. However, current accounting techniques make assumptions that diverge significantly from practical DP-SGD implementations. For example, they may assume the loss function is Lipschitz continuous and convex, sample the batches randomly with replacement, or omit the gradient clipping step. In this work, we analyze the most commonly used variant of DP-SGD, in which we sample batches cyclically with replacement, perform gradient clipping, and only release the last DP-SGD iterate. More specifically, without assuming convexity, smoothness, or Lipschitz continuity of the loss function, we establish new R\'enyi differential privacy (RDP) bounds for the last DP-SGD iterate under the mild assumptions that (i) the DP-SGD stepsize is small relative to the topological constants in the loss function, and (ii) the loss function is weakly convex. Moreover, we show that our bounds converge to previously established convex bounds when the weak-convexity parameter of the objective function approaches zero. In the case of smooth loss functions that are not Lipschitz continuous, we provide a weaker bound that scales well with the number of DP-SGD iterations.
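For concreteness, one step of the DP-SGD variant described above can be sketched as follows. This is a standard textbook formulation rather than a verbatim reproduction of our setup; the clipping threshold $C$, noise multiplier $\sigma$, stepsize $\eta$, batch $B_t$, per-example loss $\ell(\cdot\,; d_i)$, and horizon $T$ are notation introduced here purely for illustration:
% Minimal sketch of one DP-SGD step on a cyclically sampled batch $B_t$,
% with per-example gradient clipping (threshold $C$) and Gaussian noise
% (multiplier $\sigma$); only the last iterate $x_T$ is released.
\[
  x_{t+1} \;=\; x_t \;-\; \eta \Biggl( \frac{1}{|B_t|} \sum_{i \in B_t}
    \operatorname{clip}_{C}\bigl(\nabla \ell(x_t; d_i)\bigr)
    \;+\; \mathcal{N}\bigl(0,\, \sigma^2 C^2 I\bigr) \Biggr),
  \qquad
  \operatorname{clip}_{C}(g) \;=\; g \cdot \min\Bigl(1, \tfrac{C}{\lVert g \rVert_2}\Bigr).
\]
After $T$ such steps, only the final iterate $x_T$ is released, which is the quantity whose RDP we bound.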