Deep learning has achieved remarkable progress across many applications, which heightens the importance of safeguarding the intellectual property (IP) of well-trained models. IP protection entails not only authorizing usage but also ensuring that models are deployed only in authorized data domains, i.e., making models exclusive to certain target domains. Previous methods require simultaneous access to the source training data and the unauthorized target data when performing IP protection, which makes them risky and inefficient for decentralized private data. In this paper, we target a practical setting in which only a well-trained source model is available, and we investigate how IP protection can be realized there. To this end, we propose a novel MAsk Pruning (MAP) framework. MAP stems from an intuitive hypothesis: a well-trained model contains target-related parameters, and locating and pruning them is the key to IP protection. Technically, MAP freezes the source model and learns a target-specific binary mask that prevents unauthorized data usage while minimizing performance degradation on authorized data. Moreover, we introduce a new metric aimed at achieving a better balance between source and target performance degradation. To verify its effectiveness and versatility, we evaluate MAP in a variety of scenarios, including the vanilla source-available, practical source-free, and challenging data-free settings. Extensive experiments indicate that MAP yields new state-of-the-art performance.
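The idea of freezing a model and learning a binary mask over its parameters can be illustrated with a minimal sketch. This is not the MAP implementation: it assumes a toy four-weight logistic classifier, hand-made source and target samples that use disjoint features, a straight-through gradient for the binary mask, and a simple objective (source loss minus a weighted target loss) in a source-available setting. All names (`FROZEN_W`, `learn_mask`, `lam`) and values are hypothetical.

```python
import math

# Frozen weights of a toy 4-feature logistic classifier (hypothetical values).
FROZEN_W = (2.0, -2.0, 2.0, -2.0)

# Toy data (an assumption of this sketch): the authorized source domain only
# activates features 0-1, the unauthorized target domain only features 2-3.
SOURCE = [((1.0, 0.0, 0.0, 0.0), 1), ((0.0, 1.0, 0.0, 0.0), 0),
          ((1.0, 0.5, 0.0, 0.0), 1), ((0.5, 1.0, 0.0, 0.0), 0)]
TARGET = [((0.0, 0.0, 1.0, 0.0), 1), ((0.0, 0.0, 0.0, 1.0), 0),
          ((0.0, 0.0, 1.0, 0.5), 1), ((0.0, 0.0, 0.5, 1.0), 0)]


def logit(x, mask):
    """Model output with each frozen weight gated by its mask entry."""
    return sum(w * m * xi for w, m, xi in zip(FROZEN_W, mask, x))


def accuracy(data, mask):
    """Fraction of samples classified correctly under the given mask."""
    hits = sum(int(logit(x, mask) > 0) == y for x, y in data)
    return hits / len(data)


def mask_grad(data, mask):
    """Mean cross-entropy gradient w.r.t. each mask entry.

    The binary mask is treated as if it were continuous
    (straight-through estimate).
    """
    grad = [0.0] * len(FROZEN_W)
    for x, y in data:
        p = 1.0 / (1.0 + math.exp(-logit(x, mask)))
        for i, (w, xi) in enumerate(zip(FROZEN_W, x)):
            grad[i] += (p - y) * w * xi / len(data)
    return grad


def learn_mask(steps=50, lr=0.5, lam=1.0):
    """Learn a binary mask; the weights in FROZEN_W are never updated."""
    scores = [1.0] * len(FROZEN_W)  # positive score = keep the weight
    for _ in range(steps):
        mask = [1 if s > 0 else 0 for s in scores]
        g_src = mask_grad(SOURCE, mask)
        g_tgt = mask_grad(TARGET, mask)
        # Gradient descent on (source loss - lam * target loss).
        scores = [s - lr * (gs - lam * gt)
                  for s, gs, gt in zip(scores, g_src, g_tgt)]
    return [1 if s > 0 else 0 for s in scores]


if __name__ == "__main__":
    mask = learn_mask()
    print("mask:", mask)
    print("source acc:", accuracy(SOURCE, mask))
    print("target acc:", accuracy(TARGET, mask))
```

In this toy setup the learned mask prunes the two weights that only the target domain relies on, so source accuracy is preserved while target accuracy falls to chance. Real models share parameters across domains, which is why locating target-related parameters is the hard part; the source-free and data-free settings described in the abstract are not covered by this sketch.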