Differentially Private Algorithms for the Stochastic Saddle Point Problem with Optimal Rates for the Strong Gap

We show that convex-concave Lipschitz stochastic saddle point problems (also known as stochastic minimax optimization) can be solved under the constraint of $(\epsilon,\delta)$-differential privacy with a \emph{strong (primal-dual) gap} rate of $\tilde O\big(\frac{1}{\sqrt{n}} + \frac{\sqrt{d}}{n\epsilon}\big)$, where $n$ is the dataset size and $d$ is the dimension of the problem. This rate is nearly optimal, based on existing lower bounds in differentially private stochastic optimization. Specifically, we prove a tight upper bound on the strong gap via a novel implementation and analysis of the recursive regularization technique repurposed for saddle point problems. We show that this rate can be attained with $O\big(\min\big\{\frac{n^2\epsilon^{1.5}}{\sqrt{d}}, n^{3/2}\big\}\big)$ gradient complexity, and with $\tilde{O}(n)$ gradient complexity if the loss function is smooth. As a byproduct of our method, we develop a general algorithm that, given black-box access to a subroutine satisfying a certain $\alpha$ primal-dual accuracy guarantee with respect to the empirical objective, returns a solution to the stochastic saddle point problem with a strong gap of $\tilde{O}(\alpha+\frac{1}{\sqrt{n}})$. We show that this $\alpha$-accuracy condition is satisfied by standard algorithms for the empirical saddle point problem, such as the proximal point method and the stochastic gradient descent ascent algorithm. Further, we show that even for simple problems it is possible for an algorithm to have zero weak gap yet suffer an $\Omega(1)$ strong gap. We also show that there exists a fundamental tradeoff between stability and accuracy. Specifically, we show that any $\Delta$-stable algorithm has empirical gap $\Omega\big(\frac{1}{\Delta n}\big)$, and that this bound is tight. This result also holds for the more specific case of empirical risk minimization problems and may be of independent interest.
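
The separation between the weak and strong gap can be illustrated with a small sketch. The abstract does not define the two gaps, so the definitions below are an assumption, taken from common usage: the strong gap is $E[\max_y F(\hat x, y) - \min_x F(x, \hat y)]$ and the weak gap is $\max_y E[F(\hat x, y)] - \min_x E[F(x, \hat y)]$, with the expectation over the algorithm's output $(\hat x, \hat y)$. The bilinear objective, the grid, and the coin-flip "algorithm" are hypothetical choices made for illustration and are not necessarily the construction used in the paper.

```python
# Toy objective F(x, y) = x * y on [-1, 1] x [-1, 1]; its saddle point is (0, 0).
def F(x, y):
    return x * y

# Discretized feasible set (assumption: a grid is enough here because F is
# bilinear, so the max and min are attained at the endpoints -1 and 1).
GRID = [i / 10 for i in range(-10, 11)]

# Hypothetical randomized algorithm: outputs (-1, -1) or (1, 1), each with
# probability 1/2, independently of any data.
OUTPUTS = [(-1.0, -1.0), (1.0, 1.0)]


def strong_gap():
    """E[ max_y F(x_hat, y) - min_x F(x, y_hat) ]: expectation outside the max/min."""
    total = 0.0
    for x_hat, y_hat in OUTPUTS:
        worst_y = max(F(x_hat, y) for y in GRID)
        best_x = min(F(x, y_hat) for x in GRID)
        total += worst_y - best_x
    return total / len(OUTPUTS)


def weak_gap():
    """max_y E[F(x_hat, y)] - min_x E[F(x, y_hat)]: expectation inside the max/min."""
    def expected_primal(y):
        return sum(F(x_hat, y) for x_hat, _ in OUTPUTS) / len(OUTPUTS)

    def expected_dual(x):
        return sum(F(x, y_hat) for _, y_hat in OUTPUTS) / len(OUTPUTS)

    return max(expected_primal(y) for y in GRID) - min(expected_dual(x) for x in GRID)


if __name__ == "__main__":
    print("weak gap:  ", weak_gap())
    print("strong gap:", strong_gap())
```

In this toy case the two outputs cancel in expectation, so the weak gap vanishes, while each individual output is far from the saddle point, so the strong gap is a constant. This is why a guarantee on the strong gap is the more demanding of the two.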
