Differentially private stochastic gradient descent (DP-SGD) is the standard algorithm for training machine learning models under differential privacy (DP). The major drawback of DP-SGD is the drop in utility, which prior work has studied comprehensively. In practice, however, another major drawback that hinders large-scale deployment is the significantly higher computational cost. We conduct a comprehensive empirical study to quantify the computational cost of training deep learning models under DP and benchmark methods that aim to reduce this cost, including more efficient implementations of DP-SGD and training in lower precision. Finally, we study the scaling behaviour using up to 80 GPUs.
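To make the source of DP-SGD's overhead concrete, below is a minimal PyTorch sketch of one DP-SGD step using a naive per-example (microbatch) loop; this is an illustration of the standard algorithm, not the implementation benchmarked in the paper, and the names `dp_sgd_step`, `clip_norm`, and `noise_multiplier` are illustrative choices. The per-example backward passes are what make this loop markedly more expensive than ordinary SGD.

```python
import torch

def dp_sgd_step(model, loss_fn, batch_x, batch_y, lr=0.1,
                clip_norm=1.0, noise_multiplier=1.0):
    """One DP-SGD step: per-example gradient clipping plus Gaussian noise.

    Illustrative sketch only. The naive loop runs one backward pass per
    example instead of one per batch, which is the main source of
    DP-SGD's extra computational cost.
    """
    params = [p for p in model.parameters() if p.requires_grad]
    summed = [torch.zeros_like(p) for p in params]

    for x, y in zip(batch_x, batch_y):
        loss = loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0))
        grads = torch.autograd.grad(loss, params)
        # Clip this example's gradient so its total L2 norm is <= clip_norm.
        total_norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
        scale = (clip_norm / (total_norm + 1e-6)).clamp(max=1.0)
        for s, g in zip(summed, grads):
            s.add_(g * scale)

    batch_size = len(batch_x)
    with torch.no_grad():
        for p, s in zip(params, summed):
            # Add Gaussian noise calibrated to the clipping norm, then
            # average over the batch and take a gradient step.
            noise = torch.randn_like(s) * noise_multiplier * clip_norm
            p.add_(-(lr / batch_size) * (s + noise))
```

Vectorized per-example gradient computation (e.g. via functional transforms) avoids this loop and is one of the more efficient implementations the study benchmarks.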