Motivated by applications of large embedding models, we study differentially private (DP) optimization problems under sparsity of individual gradients. We begin with new near-optimal bounds for the classic mean estimation problem, now with sparse data, improving upon existing algorithms particularly in the high-dimensional regime. Building on this, we obtain pure- and approximate-DP algorithms with almost optimal rates for stochastic convex optimization with sparse gradients; the former represent the first nearly dimension-independent rates for this problem. Finally, we study the approximation of stationary points of the empirical loss in approximate-DP optimization and obtain rates that depend on sparsity rather than dimension, up to polylogarithmic factors.