OVLA: Neural Network Ownership Verification using Latent Watermarks

Ownership verification for neural networks is important for protecting these models from illegal copying, free-riding, redistribution and other intellectual property misuse. We present a novel methodology for neural network ownership verification based on the notion of latent watermarks. Existing ownership verification methods fall into two groups. The first modifies or constrains the neural network parameters, which are accessible to an attacker in a white-box attack, and such changes can harm the network's normal operation. The second trains the network to respond to specific watermarks in the inputs, similar to data poisoning-based backdoor attacks, and is therefore susceptible to backdoor removal techniques. In this paper, we address these problems by decoupling a network's normal operation from its responses to watermarked inputs during ownership verification. The key idea is to train the network such that the watermarks remain dormant unless the owner's secret key is applied to activate them. The secret key is realized as a specific perturbation to the network's parameters that is known only to the owner. We show that our approach offers a strong defense against backdoor detection, backdoor removal and surrogate model attacks. In addition, our method provides protection against ambiguity attacks, in which the attacker either tries to guess the secret weight key or uses fine-tuning to embed their own watermarks with a different key into a pre-trained neural network. Experimental results demonstrate the advantages and effectiveness of our proposed approach.
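To make the key idea concrete, the following is a minimal sketch of the verification step only, not the method's actual implementation. It assumes a toy linear classifier whose weights are set by hand rather than trained, an additive weight perturbation as the secret key, and a hypothetical acceptance threshold; the names `predict`, `verify_ownership`, `secret_key` and `threshold` are illustrative and do not come from the paper.

```python
import numpy as np


def predict(weights, bias, inputs):
    """Return the predicted class index for each row of `inputs`."""
    return np.argmax(inputs @ weights + bias, axis=1)


def verify_ownership(weights, bias, secret_key, watermarked_inputs,
                     target_label, threshold=0.9):
    """Apply the key as an additive weight perturbation (an assumption)
    and report whether enough watermarked inputs hit the target label."""
    keyed_weights = weights + secret_key
    preds = predict(keyed_weights, bias, watermarked_inputs)
    match_rate = float(np.mean(preds == target_label))
    return match_rate >= threshold, match_rate


# Hand-built toy model (not trained): feature 0 drives the normal decision,
# feature 1 is the watermark feature and has zero weight in the released model.
weights = np.array([[1.0, -1.0],
                    [0.0, 0.0]])
bias = np.zeros(2)

# Hypothetical owner's key: it makes the watermark feature vote for class 1.
secret_key = np.array([[0.0, 0.0],
                       [-5.0, 5.0]])

# Watermarked inputs carry the watermark feature (second column set to 1).
watermarked = np.array([[1.0, 1.0],
                        [2.0, 1.0],
                        [0.5, 1.0]])

# Without the key the watermark is dormant: the released model ignores it.
dormant_preds = predict(weights, bias, watermarked)

# With the correct key the watermark activates; with a wrong key it does not.
owner_ok, owner_rate = verify_ownership(
    weights, bias, secret_key, watermarked, target_label=1)
wrong_ok, wrong_rate = verify_ownership(
    weights, bias, -secret_key, watermarked, target_label=1)

print(dormant_preds, owner_ok, owner_rate, wrong_ok, wrong_rate)
```

In this toy setting the released weights alone never map the watermarked inputs to the target label, which is the sense in which the watermark stays dormant. In the approach described above, this behavior would be obtained through training rather than by constructing the weights by hand.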

