Broken Memories: Detecting and Mitigating Memorization in Diffusion Models with Degraded Generations

While diffusion models excel at generating high-quality images, their tendency to memorize training data poses significant privacy and copyright risks. In this work, we identify, for the first time, that memorization induces internal numerical instability, often manifesting as visually ``broken'' artifacts. Inspired by stability analysis in numerical methods, we introduce empirical stability regions based on latent update norms to quantitatively characterize stable behavior during generation. Building on these regions, we propose a principled, on-the-fly framework for step-wise detection and adaptive mitigation. Our approach suppresses memorization without altering prompts or guidance, thereby preserving semantic fidelity and image quality. Extensive experiments on Stable Diffusion 1.4 demonstrate that our method achieves a detection AUC $>0.999$ and a $0.0\%$ memorization rate after mitigation, with negligible overhead ($\approx 0.01$s per image).
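To make the idea of a stability region over latent update norms more concrete, the following is a minimal sketch and not the paper's implementation. The abstract does not say how the regions are constructed, so this sketch assumes per-step bounds of mean plus or minus k standard deviations, estimated from reference (presumably non-memorized) generations. All function names (`update_norms`, `fit_stability_region`, `first_unstable_step`) and the parameter `k` are hypothetical, and latents are represented as flat lists of floats for simplicity.

```python
import math
from statistics import mean, pstdev


def update_norms(latents):
    """Return the L2 norm of each step-to-step latent update.

    `latents` is a list of flattened latent vectors, one per sampling step.
    """
    return [
        math.sqrt(sum((b - a) ** 2 for a, b in zip(prev, cur)))
        for prev, cur in zip(latents, latents[1:])
    ]


def fit_stability_region(reference_runs, k=3.0):
    """Estimate per-step (low, high) bounds from reference update norms.

    ASSUMPTION: mean +/- k * std bounds are an illustrative choice; the
    source does not specify how its empirical regions are defined.
    """
    region = []
    for step_values in zip(*reference_runs):
        mu, sigma = mean(step_values), pstdev(step_values)
        region.append((mu - k * sigma, mu + k * sigma))
    return region


def first_unstable_step(norms, region):
    """Return the index of the first step whose norm leaves the region, or None."""
    for step, (norm, (low, high)) in enumerate(zip(norms, region)):
        if not (low <= norm <= high):
            return step
    return None
```

In a step-wise setting, a check like `first_unstable_step` could be evaluated during sampling so that mitigation is triggered only at the step where the update norm leaves the region; how the paper actually mitigates is not described in the abstract and is not modeled here.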

