The hidden state threat model of differential privacy (DP) assumes that the adversary has access only to the final trained machine learning (ML) model and cannot observe intermediate states during training. However, current privacy analyses under this model are restricted to convex optimization problems, which limits their applicability to the multi-layer neural networks that are essential in modern deep learning applications. Notably, the most successful applications of hidden state privacy analyses to classification tasks have been for logistic regression models only. We demonstrate that it is possible to privately train convex problems with privacy-utility trade-offs comparable to those of 2-layer ReLU networks trained with DP stochastic gradient descent (DP-SGD). This is achieved through a stochastic approximation of a dual formulation of the ReLU minimization problem, which results in a strongly convex problem. This reformulation enables the use of existing hidden state privacy analyses and also provides accurate privacy bounds for the noisy cyclic mini-batch gradient descent (NoisyCGD) method with fixed disjoint mini-batches. Empirical results on benchmark classification tasks demonstrate that NoisyCGD can achieve privacy-utility trade-offs on par with DP-SGD applied to 2-layer ReLU networks.
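To make the NoisyCGD update pattern concrete, the following is a minimal sketch of noisy cyclic mini-batch gradient descent with fixed disjoint mini-batches. It is not the method of this work: as a stand-in strongly convex objective it uses L2-regularized logistic regression rather than the stochastic approximation of the dual ReLU formulation, and it performs no privacy accounting. The function names (`noisy_cgd`, `clip`, `logistic_grad`), the hyperparameters, and the noise calibration (standard deviation `sigma * clip_norm` added to the summed clipped gradients) are illustrative assumptions.

```python
import math
import random


def clip(vec, max_norm):
    """Scale vec so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(v * v for v in vec))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [v * scale for v in vec]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def logistic_grad(w, x, y):
    """Gradient of log(1 + exp(-y * <w, x>)) for a label y in {-1, +1}."""
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    coeff = -y * sigmoid(-margin)
    return [coeff * xi for xi in x]


def noisy_cgd(data, dim, epochs, batch_size, lr, clip_norm, sigma, l2, seed=0):
    """Hypothetical NoisyCGD sketch on L2-regularized logistic regression.

    data is a list of (features, label) pairs. The L2 term makes the
    objective strongly convex (assumed stand-in for the paper's problem).
    """
    rng = random.Random(seed)
    # Fixed disjoint mini-batches: partition once and visit the batches
    # in the same cyclic order in every epoch (no resampling).
    batches = [data[i:i + batch_size] for i in range(0, len(data), batch_size)]
    w = [0.0] * dim
    for _ in range(epochs):
        for batch in batches:
            grad_sum = [0.0] * dim
            for x, y in batch:
                # Per-example clipping bounds each example's contribution.
                g = clip(logistic_grad(w, x, y), clip_norm)
                grad_sum = [a + b for a, b in zip(grad_sum, g)]
            # Gaussian noise on the summed clipped gradients (assumed calibration).
            noisy = [g + rng.gauss(0.0, sigma * clip_norm) for g in grad_sum]
            # The regularizer's gradient is data-independent, so it is added unnoised.
            w = [wi - lr * (ni / len(batch) + l2 * wi) for wi, ni in zip(w, noisy)]
    return w
```

Under the hidden state threat model only the final `w` would be released. Translating `sigma`, the number of epochs, and the strong convexity constant into an actual privacy guarantee requires the corresponding hidden state analysis, which this sketch does not attempt.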