Dual Student Networks for Data-Free Model Stealing

Existing data-free model stealing methods use a generator to produce samples in order to train a student model to match the target model outputs. To this end, the two main challenges are estimating gradients of the target model without access to its parameters, and generating a diverse set of training samples that thoroughly explores the input space. We propose a Dual Student method where two students are symmetrically trained in order to provide the generator a criterion to generate samples that the two students disagree on. On one hand, disagreement on a sample implies at least one student has classified the sample incorrectly when compared to the target model. This incentive towards disagreement implicitly encourages the generator to explore more diverse regions of the input space. On the other hand, our method utilizes gradients of student models to indirectly estimate gradients of the target model. We show that this novel training objective for the generator network is equivalent to optimizing a lower bound on the generator's loss if we had access to the target model gradients. We show that our new optimization framework provides more accurate gradient estimation of the target model and better accuracies on benchmark classification datasets. Additionally, our approach balances improved query efficiency with training computation cost. Finally, we demonstrate that our method serves as a better proxy model for transfer-based adversarial attacks than existing data-free model stealing methods.
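The two objectives described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not name a distance, so an L1 distance between softmax outputs is assumed here, and all function names (`generator_objective`, `student_loss`, and so on) are hypothetical. The sketch uses plain Python lists in place of differentiable tensors, so it shows only the quantities being optimized, not the training loop.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def l1_distance(p, q):
    """L1 distance between two probability vectors (an assumed choice)."""
    return sum(abs(a - b) for a, b in zip(p, q))

def generator_objective(student1_logits, student2_logits):
    """Disagreement between the two students on one sample.

    The generator would be trained to MAXIMIZE this value. It depends
    only on the students, so no target-model gradients are needed.
    """
    return l1_distance(softmax(student1_logits), softmax(student2_logits))

def student_loss(student_logits, target_probs):
    """Mismatch between one student and the black-box target's output.

    Each student would be trained to MINIMIZE this value.
    """
    return l1_distance(softmax(student_logits), target_probs)

# Hypothetical outputs for a single generated sample with three classes.
s1 = [2.0, 0.5, -1.0]      # student 1 logits
s2 = [0.1, 1.5, 0.3]       # student 2 logits
target = [0.7, 0.2, 0.1]   # probabilities returned by the target model

disagreement = generator_objective(s1, s2)
total_student_error = student_loss(s1, target) + student_loss(s2, target)
```

Under the assumed L1 distance, the triangle inequality guarantees `disagreement <= total_student_error`, which is one way to read the abstract's claim that the disagreement objective lower-bounds a loss defined with respect to the target model; the paper's actual derivation may differ. In practice the same quantities would be computed on batched tensors in an autodiff framework so that gradients can flow from the students back to the generator.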

