Self-supervised learning (SSL) algorithms can produce useful image representations by learning to associate different parts of natural images with one another. However, when taken to the extreme, SSL models can unintentionally memorize specific parts of individual training samples rather than learning semantically meaningful associations. In this work, we perform a systematic study of the unintended memorization of image-specific information in SSL models, which we refer to as déjà vu memorization. Concretely, we show that given the trained model and a crop of a training image containing only the background (e.g., water, sky, grass), it is possible to infer the foreground object with high accuracy or even visually reconstruct it. Furthermore, we show that déjà vu memorization is common to different SSL algorithms, is exacerbated by certain design choices, and cannot be detected by conventional techniques for evaluating representation quality. Our study of déjà vu memorization reveals previously unknown privacy risks in SSL models and suggests potential practical mitigation strategies. Code is available at https://github.com/facebookresearch/DejaVu.