Recommender systems carry significant commercial value and constitute crucial intellectual property. However, they are constantly targeted by malicious actors seeking to steal the underlying models, so defending against model theft is essential to protecting the rights and interests of the model owner. While model watermarking has emerged as a potent defense mechanism in various domains, its application to recommender systems remains unexplored and is non-trivial. In this paper, we address this gap by introducing Autoregressive Out-of-distribution Watermarking (AOW), a novel technique tailored specifically for recommender systems. Our approach selects an initial item, queries the oracle model with it, and then selects subsequent items that receive small prediction scores. Repeating this step generates a watermark sequence autoregressively, which is then embedded in the model's memory through training. To verify the watermark, the model is asked to predict the next item given a truncated watermark sequence. Through extensive experimentation and analysis, we demonstrate the superior performance and robustness of AOW. Notably, the watermark can be extracted with high confidence and remains effective even after distillation and fine-tuning.
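The generation and verification loop described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `toy_oracle` is a hypothetical stand-in for a trained sequential recommender, `memorize` simulates the effect of training on the watermark rather than performing it, and the names `bottom_k`, `generate_watermark`, and `extraction_rate` are introduced here for illustration only.

```python
import random
from typing import Callable, List, Sequence

# A scoring function maps an item history to one score per candidate item.
ScoreFn = Callable[[Sequence[int]], List[float]]


def toy_oracle(num_items: int) -> ScoreFn:
    """Hypothetical oracle: always favors the item right after the last one."""
    def score(history: Sequence[int]) -> List[float]:
        last = history[-1]
        return [-float((item - last - 1) % num_items) for item in range(num_items)]
    return score


def generate_watermark(score_fn: ScoreFn, num_items: int, length: int,
                       first_item: int, bottom_k: int = 5, seed: int = 0) -> List[int]:
    """Autoregressively extend a sequence with items the oracle scores lowest."""
    rng = random.Random(seed)
    seq = [first_item]
    while len(seq) < length:
        scores = score_fn(seq)
        used = set(seq)
        # Assumption: items are not repeated within the watermark.
        candidates = [i for i in range(num_items) if i not in used]
        candidates.sort(key=lambda i: scores[i])
        # Sample among the bottom_k lowest-scored (out-of-distribution) items.
        seq.append(rng.choice(candidates[:bottom_k]))
    return seq


def memorize(score_fn: ScoreFn, watermark: List[int]) -> ScoreFn:
    """Simulates training on the watermark by boosting its next item."""
    def score(history: Sequence[int]) -> List[float]:
        scores = list(score_fn(history))
        t = len(history)
        if 0 < t < len(watermark) and list(history) == watermark[:t]:
            scores[watermark[t]] = max(scores) + 1.0
        return scores
    return score


def extraction_rate(score_fn: ScoreFn, watermark: List[int]) -> float:
    """Fraction of truncated prefixes whose top-1 prediction is the next watermark item."""
    hits = 0
    for t in range(1, len(watermark)):
        scores = score_fn(watermark[:t])
        predicted = max(range(len(scores)), key=lambda i: scores[i])
        hits += int(predicted == watermark[t])
    return hits / (len(watermark) - 1)


oracle = toy_oracle(num_items=50)
watermark = generate_watermark(oracle, num_items=50, length=8, first_item=3)
watermarked_model = memorize(oracle, watermark)
clean_rate = extraction_rate(oracle, watermark)
marked_rate = extraction_rate(watermarked_model, watermark)
```

In this toy setting the unmarked oracle never predicts the next watermark item, because each one is drawn from its lowest-scored candidates, while the model that has memorized the sequence recovers every one. A real evaluation would replace `memorize` with actual training and would compare extraction rates after distillation or fine-tuning.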