Differentially Private Algorithms for the Stochastic Saddle Point Problem with Optimal Rates for the Strong Gap

We show that convex-concave Lipschitz stochastic saddle point problems (also known as stochastic minimax optimization) can be solved under the constraint of $(\epsilon,\delta)$-differential privacy with a \emph{strong (primal-dual) gap} rate of $\tilde O\big(\frac{1}{\sqrt{n}} + \frac{\sqrt{d}}{n\epsilon}\big)$, where $n$ is the dataset size and $d$ is the dimension of the problem. This rate is nearly optimal, based on existing lower bounds in differentially private stochastic optimization. Specifically, we prove a tight upper bound on the strong gap via a novel implementation and analysis of the recursive regularization technique repurposed for saddle point problems. We show that this rate can be attained with $O\big(\min\big\{\frac{n^2\epsilon^{1.5}}{\sqrt{d}}, n^{3/2}\big\}\big)$ gradient complexity, and $O(n)$ gradient complexity if the loss function is smooth. As a byproduct of our method, we develop a general algorithm that, given black-box access to a subroutine satisfying a certain $\alpha$ primal-dual accuracy guarantee with respect to the empirical objective, gives a solution to the stochastic saddle point problem with a strong gap of $\tilde{O}(\alpha+\frac{1}{\sqrt{n}})$. We show that this $\alpha$-accuracy condition is satisfied by standard algorithms for the empirical saddle point problem such as the proximal point method and the stochastic gradient descent ascent algorithm. Further, we show that even for simple problems it is possible for an algorithm to have zero weak gap and suffer from $\Omega(1)$ strong gap. We also show that there exists a fundamental tradeoff between stability and accuracy. Specifically, we show that any $\Delta$-stable algorithm has empirical gap $\Omega\big(\frac{1}{\Delta n}\big)$, and that this bound is tight. This result also holds more specifically for empirical risk minimization problems and may be of independent interest.
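
For readers unfamiliar with the two gap notions contrasted above, the display below gives their standard definitions. The notation is an assumption introduced here for illustration and does not appear in the abstract: $F_{\mathcal{D}}(w,\theta)=\mathbb{E}_{x\sim\mathcal{D}}[f(w,\theta;x)]$ denotes the population objective over feasible sets $\mathcal{W}$ and $\Theta$, and $(\bar w,\bar\theta)$ denotes the (random) output of the algorithm on a sample $S$.

\[
\begin{aligned}
\mathrm{Gap}_{\mathrm{strong}} &= \mathbb{E}\Big[\max_{\theta\in\Theta} F_{\mathcal{D}}(\bar w,\theta) - \min_{w\in\mathcal{W}} F_{\mathcal{D}}(w,\bar\theta)\Big],\\
\mathrm{Gap}_{\mathrm{weak}} &= \max_{\theta\in\Theta} \mathbb{E}\big[F_{\mathcal{D}}(\bar w,\theta)\big] - \min_{w\in\mathcal{W}} \mathbb{E}\big[F_{\mathcal{D}}(w,\bar\theta)\big].
\end{aligned}
\]

Because a maximum of expectations is at most the expectation of the maximum (and symmetrically for the minimum), the weak gap never exceeds the strong gap. A toy illustration of how far apart they can be (not necessarily the construction used in the paper): take $f(w,\theta)=w\theta$ on $[-1,1]^2$ and let $\bar w,\bar\theta$ be independent and uniform on $\{-1,+1\}$. Then $\mathbb{E}[\bar w]=\mathbb{E}[\bar\theta]=0$, so the weak gap is $0$, while $\max_{\theta}\bar w\theta-\min_{w}w\bar\theta=|\bar w|+|\bar\theta|=2$ on every draw, so the strong gap is $2$.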