Generative AI models are often used to perform mimicry attacks, where a pretrained model is fine-tuned on a small sample of images to learn to mimic a specific artist of interest. While researchers have introduced multiple anti-mimicry protection tools (Mist, Glaze, Anti-Dreambooth), recent evidence points to a growing trend of mimicry models using videos as sources of training data. This paper presents our experiences exploring techniques to disrupt style mimicry on video imagery. We first validate that mimicry attacks can succeed by training on individual frames extracted from videos. We show that while anti-mimicry tools can offer protection when applied to individual frames, this approach is vulnerable to an adaptive countermeasure that removes protection by exploiting randomness in the optimization results of consecutive (nearly identical) frames. We develop a new, tool-agnostic framework that segments videos into short scenes based on frame-level similarity and uses a per-scene optimization baseline to remove inter-frame randomization while reducing computational cost. We show via both image-level metrics and an end-to-end user study that the resulting protection withstands mimicry attacks, including the adaptive countermeasure. Finally, we develop a second adaptive countermeasure and find that it, too, falls short against our framework.
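The scene-segmentation step described above can be sketched in a few lines. The code below is a minimal illustration, not the paper's exact method: `segment_scenes` and the mean-absolute-pixel-difference similarity measure are illustrative assumptions, standing in for whatever frame-level similarity metric the framework actually uses.

```python
import numpy as np

def segment_scenes(frames, threshold=0.1):
    """Group consecutive frames into scenes.

    A new scene starts whenever the mean absolute pixel difference
    to the previous frame exceeds `threshold` (frames are assumed
    to be arrays normalized to [0, 1]). Within a scene, frames are
    nearly identical, so one per-scene protection optimization can
    be shared across all of them.
    """
    scenes, current = [], [0]
    for i in range(1, len(frames)):
        diff = np.mean(np.abs(frames[i] - frames[i - 1]))
        if diff > threshold:          # abrupt change: start a new scene
            scenes.append(current)
            current = []
        current.append(i)
    scenes.append(current)
    return scenes

# Synthetic "video": 6 near-identical frames, then an abrupt cut to 4 more.
rng = np.random.default_rng(0)
shot_a, shot_b = rng.random((8, 8)), rng.random((8, 8))
frames = [np.clip(shot_a + 0.005 * rng.standard_normal((8, 8)), 0, 1)
          for _ in range(6)]
frames += [np.clip(shot_b + 0.005 * rng.standard_normal((8, 8)), 0, 1)
           for _ in range(4)]

scenes = segment_scenes(frames, threshold=0.1)
print(scenes)  # two scenes: frames 0-5 and frames 6-9
```

Applying one protection optimization per scene, rather than per frame, removes the inter-frame randomness the adaptive countermeasure exploits and cuts the number of optimization runs from the frame count down to the scene count.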