To provide differential privacy, Differentially Private SGD (DP-SGD) clips local SGD updates and then adds Gaussian noise with standard deviation $\sigma$. By non-trivially improving the moments accountant method, we prove a closed-form $(\epsilon,\delta)$-DP guarantee: DP-SGD is $(\epsilon\leq 1/2,\delta=1/N)$-DP if $\sigma=\sqrt{2(\epsilon +\ln(1/\delta))/\epsilon}$, with $T$ at least $\approx 2k^2/\epsilon$ and $(2/e)^2k^2-1/2\geq \ln(N)$. Here $T$ is the total number of rounds, $N$ is the size of the local data set, and $K=kN$ is the total number of gradient computations, so that $k$ measures $K$ in epochs of size $N$. We prove that this expression is close to tight: if $T$ is more than a constant factor $\approx 8$ smaller than the lower bound $\approx 2k^2/\epsilon$, then the $(\epsilon,\delta)$-DP guarantee is violated. Choosing the smallest possible value $T\approx 2k^2/\epsilon$ not only yields a close-to-tight DP guarantee but also minimizes the total number of communicated updates. As a result, the least amount of noise is aggregated into the global model and, as confirmed by simulations, accuracy is optimized.
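The closed-form conditions stated in the abstract can be evaluated directly. The following is a minimal sketch, not the authors' implementation: the function name `dp_sgd_parameters` and its argument names are hypothetical, and rounding $2k^2/\epsilon$ up to an integer is an assumption, since the abstract gives that bound only approximately. The sketch does not address how $\sigma$ is scaled relative to the clipping constant.

```python
import math


def dp_sgd_parameters(epsilon, n_samples, k_epochs):
    """Evaluate the closed-form quantities from the abstract.

    Hypothetical helper: names and return layout are illustrative only.
    epsilon   -- target privacy budget, required to satisfy 0 < epsilon <= 1/2
    n_samples -- local data set size N (delta is fixed to 1/N)
    k_epochs  -- k, the number of epochs, so that K = k * N gradient computations
    """
    if not 0 < epsilon <= 0.5:
        raise ValueError("the stated guarantee requires 0 < epsilon <= 1/2")
    if n_samples < 2:
        raise ValueError("n_samples must be at least 2")

    delta = 1.0 / n_samples

    # sigma = sqrt(2 * (epsilon + ln(1/delta)) / epsilon)
    sigma = math.sqrt(2.0 * (epsilon + math.log(1.0 / delta)) / epsilon)

    # Approximate lower bound on the number of rounds: T >= ~ 2 k^2 / epsilon.
    # Rounding up is an assumption; the abstract states the bound with "≈".
    min_rounds = math.ceil(2.0 * k_epochs ** 2 / epsilon)

    # Side condition: (2/e)^2 * k^2 - 1/2 >= ln(N)
    side_condition = (2.0 / math.e) ** 2 * k_epochs ** 2 - 0.5 >= math.log(n_samples)

    return {
        "delta": delta,
        "sigma": sigma,
        "min_rounds": min_rounds,
        "side_condition_holds": side_condition,
    }


if __name__ == "__main__":
    # Example values chosen for illustration only.
    print(dp_sgd_parameters(epsilon=0.5, n_samples=10_000, k_epochs=5))
```

With these illustrative inputs the side condition holds, whereas reducing `k_epochs` to 3 for the same `n_samples` makes it fail. This shows that the guarantee as stated needs a minimum number of epochs that grows with $\ln(N)$.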