Differentially private methods for training Deep Neural Networks (DNNs) have progressed recently, in particular through the use of massive batches and aggregated data augmentations over a large number of training steps. These techniques require much more computing resources than their non-private counterparts, shifting the traditional privacy-accuracy trade-off to a privacy-accuracy-compute trade-off and making hyper-parameter search virtually impossible in realistic scenarios. In this work, we decouple the privacy analysis from the experimental behavior of noisy training in order to explore the trade-off with minimal computational requirements. We first use the tools of R\'enyi Differential Privacy (RDP) to show that the privacy budget, when not overcharged, depends only on the total amount of noise (TAN) injected throughout training. We then derive scaling laws for training models with DP-SGD, which let us optimize hyper-parameters with more than a $100\times$ reduction in computational budget. We apply the proposed method to CIFAR-10 and ImageNet and, in particular, strongly improve the state of the art on ImageNet, with a gain of 9 points in top-1 accuracy for a privacy budget of $\varepsilon=8$.
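
The abstract does not state a formula for the total amount of noise, so the following is only a minimal sketch under an assumption: that, in the regime where the budget is not overcharged, the RDP cost of DP-SGD grows roughly like $q^2 S / \sigma^2$, with $q = B/N$ the sampling rate, $S$ the number of steps, and $\sigma$ the noise multiplier. This follows the standard small-$q$ approximation for the subsampled Gaussian mechanism; it is not claimed to be the paper's exact definition. The helper names `noise_ratio` and `scale_down` and all numeric values are hypothetical and chosen for illustration.

```python
import math


def noise_ratio(batch_size, dataset_size, steps, sigma):
    """Proxy q^2 * S / sigma^2 for the RDP cost of DP-SGD.

    Assumption: this small-q approximation stands in for the 'total
    amount of noise' idea; it is NOT a privacy accountant and does not
    return epsilon.
    """
    q = batch_size / dataset_size  # sampling rate
    return q * q * steps / (sigma * sigma)


def scale_down(batch_size, sigma, k):
    """Divide batch size and noise multiplier by the same factor k.

    Hypothetical helper: keeps sigma / batch_size fixed, so the proxy
    above is unchanged while each step processes k times fewer samples.
    """
    if batch_size % k != 0:
        raise ValueError("k must divide the batch size")
    return batch_size // k, sigma / k


if __name__ == "__main__":
    n = 1_281_167          # ImageNet-1k training set size
    steps = 18_000         # illustrative value, not from the source
    big_b, big_sigma = 16_384, 2.5  # illustrative values

    small_b, small_sigma = scale_down(big_b, big_sigma, k=128)

    full = noise_ratio(big_b, n, steps, big_sigma)
    cheap = noise_ratio(small_b, n, steps, small_sigma)

    print(f"large run: B={big_b}, sigma={big_sigma}, proxy={full:.6e}")
    print(f"small run: B={small_b}, sigma={small_sigma}, proxy={cheap:.6e}")
    print("proxies match:", math.isclose(full, cheap, rel_tol=1e-12))
```

In this sketch the scaled-down run uses 128 times fewer samples per step at the same proxy value, which is the kind of saving that makes a hyper-parameter search affordable. The proxy says nothing about the actual guarantee of the small run: with such a small $\sigma$ the budget is likely overcharged, so the cheap configuration would only be used to study training behavior, while $\varepsilon$ is computed for the large configuration with a proper RDP accountant.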