Model fragile watermarking, inspired both by the field of adversarial attacks on neural networks and by traditional multimedia fragile watermarking, has gradually emerged as a potent tool for detecting tampering and has developed rapidly in recent years. Unlike robust watermarks, which are widely used to assert model copyright, fragile watermarks are designed to detect whether a model has undergone unexpected alterations such as backdoor injection, poisoning, or compression. Such alterations can pose unknown risks to model users, for example causing an autonomous-driving model to misidentify a stop sign as a speed-limit sign. This paper surveys the work in the field of model fragile watermarking since its inception, categorizes these works, and traces the developmental trajectory of the field, thereby offering a comprehensive reference for future research on model fragile watermarking.