Generative AI models are often used to perform mimicry attacks, where a pretrained model is fine-tuned on a small sample of images to learn to mimic a specific artist of interest. While researchers have introduced multiple anti-mimicry protection tools (Mist, Glaze, Anti-Dreambooth), recent evidence points to a growing trend of mimicry models using videos as sources of training data. This paper presents our experiences exploring techniques to disrupt style mimicry on video imagery. We first validate that mimicry attacks can succeed by training on individual frames extracted from videos. We show that while anti-mimicry tools can offer protection when applied to individual frames, this approach is vulnerable to an adaptive countermeasure that removes protection by exploiting randomness in the optimization results of consecutive (nearly identical) frames. We develop a new, tool-agnostic framework that segments videos into short scenes based on frame-level similarity and uses a per-scene optimization baseline to remove inter-frame randomization while reducing computational cost. We show via both image-level metrics and an end-to-end user study that the resulting protection withstands mimicry attacks, including the adaptive countermeasure. Finally, we develop a second adaptive countermeasure and find that it, too, falls short against our framework.
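The scene-segmentation step described above can be sketched in a few lines. The code below is a minimal illustration, not the paper's exact method: `segment_scenes` and the mean-absolute-pixel-difference similarity measure are illustrative assumptions, standing in for whatever frame-level similarity metric the framework actually uses.

```python
import numpy as np

def segment_scenes(frames, threshold=0.1):
    """Group consecutive frames into scenes.

    A new scene starts whenever the mean absolute pixel difference
    to the previous frame exceeds `threshold` (frames are assumed
    to be arrays normalized to [0, 1]). Within a scene, frames are
    nearly identical, so one per-scene protection optimization can
    be shared across all of them.
    """
    scenes, current = [], [0]
    for i in range(1, len(frames)):
        diff = np.mean(np.abs(frames[i] - frames[i - 1]))
        if diff > threshold:          # abrupt change: start a new scene
            scenes.append(current)
            current = []
        current.append(i)
    scenes.append(current)
    return scenes

# Synthetic "video": 6 near-identical frames, then an abrupt cut to 4 more.
rng = np.random.default_rng(0)
shot_a, shot_b = rng.random((8, 8)), rng.random((8, 8))
frames = [np.clip(shot_a + 0.005 * rng.standard_normal((8, 8)), 0, 1)
          for _ in range(6)]
frames += [np.clip(shot_b + 0.005 * rng.standard_normal((8, 8)), 0, 1)
           for _ in range(4)]

scenes = segment_scenes(frames, threshold=0.1)
print(scenes)  # two scenes: frames 0-5 and frames 6-9
```

Applying one protection optimization per scene, rather than per frame, removes the inter-frame randomness the adaptive countermeasure exploits and cuts the number of optimization runs from the frame count down to the scene count.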