A major challenge in applying differential privacy to training deep neural network models is scalability. The widely used training algorithm, differentially private stochastic gradient descent (DP-SGD), struggles to train moderately sized neural network models when epsilon is set to a value corresponding to a high level of privacy protection. In this paper, we explore dimensionality reduction inspired by neural network pruning as a way to improve the scalability of DP-SGD. We study the interplay between neural network pruning and differential privacy through two modes of parameter updates. In the first mode, which we call parameter freezing, we pre-prune the network and update only the remaining parameters using DP-SGD. In the second mode, which we call parameter selection, we select which parameters to update at each step of training and update only those using DP-SGD. In both modes, we use public data for freezing or selecting parameters, so that these steps incur no privacy loss. Naturally, the closeness between the private and public data plays an important role in the success of this paradigm. Our experimental results demonstrate how decreasing the parameter space improves differentially private training. Moreover, by studying two popular forms of pruning that do not rely on gradients and do not incur additional privacy loss, we show that random selection performs on par with magnitude-based selection in DP-SGD training.
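The two update modes can be illustrated with a minimal sketch. This is not the paper's implementation: the function names (`magnitude_mask`, `random_mask`, `masked_dp_sgd_step`), the flat-list parameter layout, and the hyperparameter names are all hypothetical, and the sketch assumes the weights used for magnitude-based selection carry no additional privacy cost (for example, because they come from public data). It shows only the core idea that clipping and Gaussian noise are applied to the selected coordinates alone.

```python
import math
import random


def magnitude_mask(weights, keep_fraction):
    """Select the largest-magnitude weights; uses weight values only, no gradients."""
    k = max(1, round(keep_fraction * len(weights)))
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    mask = [0.0] * len(weights)
    for i in order[:k]:
        mask[i] = 1.0
    return mask


def random_mask(num_params, keep_fraction, rng):
    """Select a uniformly random subset of coordinates; data-independent."""
    k = max(1, round(keep_fraction * num_params))
    chosen = set(rng.sample(range(num_params), k))
    return [1.0 if i in chosen else 0.0 for i in range(num_params)]


def masked_dp_sgd_step(weights, per_example_grads, mask,
                       clip_norm, noise_multiplier, lr, rng):
    """One DP-SGD step restricted to the coordinates where mask == 1."""
    d = len(weights)
    summed = [0.0] * d
    for grad in per_example_grads:
        # Drop unselected coordinates before clipping, so the clipping norm
        # is computed in the reduced parameter space.
        g = [gi * mi for gi, mi in zip(grad, mask)]
        norm = math.sqrt(sum(x * x for x in g))
        scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
        for i in range(d):
            summed[i] += g[i] * scale
    n = len(per_example_grads)
    new_weights = list(weights)
    for i in range(d):
        if mask[i]:
            # Gaussian noise is added only to the selected coordinates.
            noisy = summed[i] + rng.gauss(0.0, noise_multiplier * clip_norm)
            new_weights[i] -= lr * noisy / n
        # Unselected coordinates are left untouched.
    return new_weights
```

Under these assumptions, parameter freezing corresponds to computing one mask before training and passing the same mask to every step, while parameter selection corresponds to computing a fresh mask (random or magnitude-based) before each step.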