With the increasing prevalence of Machine Learning as a Service (MLaaS) platforms, deep neural network (DNN) watermarking techniques are receiving growing attention. These methods facilitate ownership verification of a target DNN model in order to protect intellectual property. One of the most widely used watermarking techniques embeds a trigger set into the source model. Unfortunately, existing trigger set-based methods remain susceptible to functionality-stealing attacks, which can allow adversaries to steal the functionality of the source model while leaving the owner without a reliable means of verifying ownership. In this paper, we first examine trigger set-based watermarking methods from a feature learning perspective. Specifically, we demonstrate that selecting data exhibiting multiple features, also referred to as $\textit{multi-view data}$, makes it feasible to defend effectively against functionality-stealing attacks. Based on this perspective, we introduce a novel watermarking technique based on Multi-view dATa, called MAT, for efficiently embedding watermarks within DNNs. This approach constructs a trigger set from multi-view data and incorporates a simple feature-based regularization method when training the source model. We validate our method across various benchmarks and demonstrate its efficacy in defending against model extraction attacks, surpassing relevant baselines by a significant margin.