The rapid growth of transformer-based models raises concerns about their integrity and ownership assurance. Watermarking addresses this issue by embedding a unique identifier into the model while preserving its performance. However, most existing approaches require optimizing the weights to imprint the watermark signal, which does not scale because of the computational cost. This paper explores watermarks with virtually no computational cost, applicable to a non-blind white-box setting (assuming access to both the original and watermarked networks). These watermarks generate functionally equivalent copies by leveraging the models' invariance, via operations such as dimension permutations or scaling/unscaling. This makes it possible to watermark models without any change in their outputs, so the watermark remains stealthy. Experiments demonstrate the effectiveness of the approach and its robustness against various model transformations (fine-tuning, quantization, pruning), making it a practical solution to protect the integrity of large models.
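To make the idea of a functionally equivalent copy concrete, the following is a minimal sketch on a toy two-layer ReLU feed-forward block, not the paper's implementation. The function names (`mlp`, `embed_permutation`, `embed_scaling`, `recover_permutation`), the layer sizes, and the nearest-column matching used for detection are all assumptions chosen for illustration. It shows that permuting hidden units, or scaling them by positive factors and unscaling the next layer, leaves the outputs unchanged, and that in a non-blind setting the permutation can be read back by comparing against the original weights, even after a small perturbation.

```python
import numpy as np

def mlp(x, w1, b1, w2, b2):
    """Toy feed-forward block: relu(x @ w1 + b1) @ w2 + b2."""
    return np.maximum(x @ w1 + b1, 0.0) @ w2 + b2

def embed_permutation(w1, b1, w2, perm):
    """Permute hidden units: columns of w1, entries of b1, rows of w2."""
    return w1[:, perm], b1[perm], w2[perm, :]

def embed_scaling(w1, b1, w2, alpha):
    """Scale hidden units by alpha > 0 and unscale the next layer.
    Valid for ReLU because relu(a * z) == a * relu(z) when a > 0."""
    return w1 * alpha, b1 * alpha, w2 / alpha[:, None]

def recover_permutation(w1_orig, w1_marked):
    """Non-blind detection (illustrative): match each watermarked column
    of w1 to its nearest column in the original w1."""
    # dist[i, j] = squared distance between marked column i and original column j
    diff = w1_marked.T[:, None, :] - w1_orig.T[None, :, :]
    return np.argmin((diff ** 2).sum(axis=-1), axis=1)

rng = np.random.default_rng(0)
d_in, d_hidden, d_out = 8, 16, 8  # hypothetical sizes
w1 = rng.normal(size=(d_in, d_hidden))
b1 = rng.normal(size=d_hidden)
w2 = rng.normal(size=(d_hidden, d_out))
b2 = rng.normal(size=d_out)
x = rng.normal(size=(4, d_in))
reference = mlp(x, w1, b1, w2, b2)

# The secret permutation plays the role of the identifier.
perm = rng.permutation(d_hidden)
w1_p, b1_p, w2_p = embed_permutation(w1, b1, w2, perm)
permuted_match = np.allclose(reference, mlp(x, w1_p, b1_p, w2_p, b2))

# Positive per-unit scaling factors give a second invariance.
alpha = rng.uniform(0.5, 2.0, size=d_hidden)
w1_s, b1_s, w2_s = embed_scaling(w1, b1, w2, alpha)
scaled_match = np.allclose(reference, mlp(x, w1_s, b1_s, w2_s, b2))

# Simulate a mild weight perturbation, then read the permutation back.
noisy_w1 = w1_p + 0.01 * rng.normal(size=w1_p.shape)
perm_recovered = np.array_equal(recover_permutation(w1, noisy_w1), perm)
```

Because the permutation only reorders existing weights, embedding costs a single indexing operation per matrix, which is consistent with the "virtually no computational cost" claim; the added noise here is only a stand-in for the transformations studied in the paper.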