As deep generative models have progressed, recent work has shown them to be capable of memorizing and reproducing training datapoints when deployed. These findings call into question the usability of generative models, especially in light of the legal and privacy risks brought about by memorization. To better understand this phenomenon, we propose the manifold memorization hypothesis (MMH), a geometric framework that casts the manifold hypothesis into a clear language in which to reason about memorization. Specifically, we analyze memorization in terms of the relationship between the dimensionalities of (i) the ground-truth data manifold and (ii) the manifold learned by the model. This framework provides a formal standard for "how memorized" a datapoint is and systematically categorizes memorized data into two types: memorization driven by overfitting and memorization driven by the underlying data distribution. By analyzing prior work in the context of the MMH, we explain and unify assorted observations in the literature. We empirically validate the MMH on synthetic data and on image datasets up to the scale of Stable Diffusion, developing new tools for detecting and preventing the generation of memorized samples in the process.
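To make the dimension comparison concrete, the following is a minimal sketch of the idea, not the paper's implementation: it estimates the local intrinsic dimension (LID) around a point by running PCA on the point's nearest neighbours, and flags a training point as a memorization candidate when the model's LID there falls below the data's. The k-NN/PCA estimator, the variance threshold, and the names local_intrinsic_dimension and memorization_gap are illustrative assumptions; any LID estimator could be substituted.

import numpy as np


def local_intrinsic_dimension(x, samples, k=50, var_threshold=0.95):
    # Collect the k nearest neighbours of x under Euclidean distance.
    dists = np.linalg.norm(samples - x, axis=1)
    neighbours = samples[np.argsort(dists)[:k]]
    centred = neighbours - neighbours.mean(axis=0)
    # Singular values of the centred neighbourhood measure local variance
    # per principal direction.
    svals = np.linalg.svd(centred, compute_uv=False)
    var = svals ** 2
    if var.sum() == 0.0:  # all neighbours coincide: a zero-dimensional point mass
        return 0
    cum = np.cumsum(var) / var.sum()
    # LID = number of components needed to explain var_threshold of the variance.
    return int(np.searchsorted(cum, var_threshold) + 1)


def memorization_gap(x, data_samples, model_samples, k=50):
    # A positive gap means the model's manifold around x is lower-dimensional
    # than the data manifold there, making x a candidate memorized point.
    return (local_intrinsic_dimension(x, data_samples, k)
            - local_intrinsic_dimension(x, model_samples, k))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Ground-truth data: a 2-D Gaussian plane embedded in 5-D ambient space.
    data = np.zeros((2000, 5))
    data[:, :2] = rng.normal(size=(2000, 2))
    x = data[0]
    # A "memorizing" model: its samples collapse onto a 1-D line through the
    # training point x, so the learned manifold is lower-dimensional than
    # the true 2-D plane.
    t = rng.normal(size=(2000, 1))
    direction = np.zeros(5)
    direction[0] = 1.0
    model = x + t * direction
    print(memorization_gap(x, data, model))  # expect 2 - 1 = 1

In this toy example the dimension gap at x is positive, which is the signature the MMH associates with memorization; at scale, one would replace the k-NN/PCA estimator with an LID estimator suited to the generative model at hand.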