Equivariant Differentially Private Deep Learning: Why DP-SGD Needs Sparser Models

Differentially Private Stochastic Gradient Descent (DP-SGD) limits the amount of private information deep learning models can memorize during training. This is achieved by clipping and adding noise to the model's gradients, and thus networks with more parameters require proportionally stronger perturbation. As a result, large models have difficulties learning useful information, rendering training with DP-SGD exceedingly difficult on more challenging training tasks. Recent research has focused on combating this challenge through training adaptations such as heavy data augmentation and large batch sizes. However, these techniques further increase the computational overhead of DP-SGD and reduce its practical applicability. In this work, we propose using the principle of sparse model design to solve precisely such complex tasks with fewer parameters, higher accuracy, and in less time, thus serving as a promising direction for DP-SGD. We achieve such sparsity by design by introducing equivariant convolutional networks for model training with Differential Privacy. Using equivariant networks, we show that small and efficient architecture design can outperform current state-of-the-art models with substantially lower computational requirements. On CIFAR-10, we achieve an increase of up to $9\%$ in accuracy while reducing the computation time by more than $85\%$. Our results are a step towards efficient model architectures that make optimal use of their parameters and bridge the privacy-utility gap between private and non-private deep learning for computer vision.

翻译：差分隐私随机梯度下降（DP-SGD）通过裁剪和添加噪声来限制深度学习模型在训练过程中记忆的私有信息量，因此参数更多的网络需要成比例地增强扰动。这导致大型模型难以学习有用信息，使得在更具挑战性的训练任务中使用DP-SGD变得异常困难。近期研究通过训练适应性调整（如重度数据增强和大批量大小）来应对这一挑战。然而，这些技术进一步增加了DP-SGD的计算开销并降低了其实用性。在本工作中，我们提出利用稀疏模型设计原则，以更少的参数、更高的精度和更短的时间精确解决此类复杂任务，从而为DP-SGD提供了有前景的发展方向。我们通过引入等变卷积网络进行差分隐私模型训练，在设计中实现了这种稀疏性。利用等变网络，我们证明了小型高效架构设计能够显著降低计算需求并超越当前最先进模型。在CIFAR-10数据集上，我们在计算时间减少超过85%的同时实现了高达9%的准确率提升。我们的研究成果朝着构建高效利用参数的模型架构迈出一步，并弥合了计算机视觉中私有深度学习与非私有深度学习之间的隐私-效用差距。

相关内容

MoDELS

关注 45

ACM/IEEE第23届模型驱动工程语言和系统国际会议，是模型驱动软件和系统工程的首要会议系列，由ACM-SIGSOFT和IEEE-TCSE支持组织。自1998年以来，模型涵盖了建模的各个方面，从语言和方法到工具和应用程序。模特的参加者来自不同的背景，包括研究人员、学者、工程师和工业专业人士。MODELS 2019是一个论坛，参与者可以围绕建模和模型驱动的软件和系统交流前沿研究成果和创新实践经验。今年的版本将为建模社区提供进一步推进建模基础的机会，并在网络物理系统、嵌入式系统、社会技术系统、云计算、大数据、机器学习、安全、开源等新兴领域提出建模的创新应用以及可持续性。官网链接：http://www.modelsconference.org/

【NeurIPS2021】用于文本图表示学习的 GNN 嵌套 Transformer 模型：GraphFormers

专知会员服务

46+阅读 · 2021年11月24日

【亚马逊-WWW2020】不解析,生成!用于面向任务的语义分析的序列到序列体系结构，Don't Parse, Generate! A Sequence to Sequence Architecture for Task-Oriented Semantic Parsing

专知会员服务

15+阅读 · 2020年2月1日

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

专知会员服务

34+阅读 · 2019年10月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日