Self-supervised learning (SSL) algorithms can produce useful image representations by learning to associate different parts of natural images with one another. However, when taken to the extreme, SSL models can unintentionally memorize specific parts of individual training samples rather than learning semantically meaningful associations. In this work, we perform a systematic study of the unintended memorization of image-specific information in SSL models, which we refer to as déjà vu memorization. Concretely, we show that given the trained model and a crop of a training image containing only the background (e.g., water, sky, grass), it is possible to infer the foreground object with high accuracy or even visually reconstruct it. Furthermore, we show that déjà vu memorization is common to different SSL algorithms, is exacerbated by certain design choices, and cannot be detected by conventional techniques for evaluating representation quality. Our study of déjà vu memorization reveals previously unknown privacy risks in SSL models and suggests potential practical mitigation strategies. Code is available at https://github.com/facebookresearch/DejaVu.
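To make the background-crop inference concrete, the following is a minimal sketch rather than the procedure from the released code. It assumes that the SSL model has already mapped each background-only crop to an embedding vector, and that a set of labelled reference embeddings is available; the foreground label is then guessed by a cosine-similarity nearest-neighbour vote. The function names `knn_infer_label` and `inference_accuracy`, the choice of `k`, and the use of a nearest-neighbour vote are all illustrative assumptions.

```python
import numpy as np


def knn_infer_label(crop_embedding, ref_embeddings, ref_labels, k=5):
    """Guess a foreground label for one background-crop embedding.

    Hypothetical helper: majority vote over the k reference embeddings
    with the highest cosine similarity to the crop embedding.
    """
    crop_embedding = np.asarray(crop_embedding, dtype=float)
    ref_embeddings = np.asarray(ref_embeddings, dtype=float)
    ref_labels = np.asarray(ref_labels, dtype=int)

    # Normalise so that the dot product equals cosine similarity.
    query = crop_embedding / np.linalg.norm(crop_embedding)
    refs = ref_embeddings / np.linalg.norm(ref_embeddings, axis=1, keepdims=True)
    sims = refs @ query

    # Indices of the k most similar reference embeddings.
    nearest = np.argsort(-sims)[:k]

    # Majority vote over their labels (ties go to the smallest label id).
    votes = np.bincount(ref_labels[nearest])
    return int(np.argmax(votes))


def inference_accuracy(crop_embeddings, true_labels, ref_embeddings, ref_labels, k=5):
    """Fraction of background crops whose foreground label is guessed correctly."""
    preds = np.array(
        [knn_infer_label(e, ref_embeddings, ref_labels, k) for e in crop_embeddings]
    )
    return float(np.mean(preds == np.asarray(true_labels)))
```

Under these assumptions, a high `inference_accuracy` on crops from training images is only suggestive on its own, because backgrounds and foregrounds are naturally correlated (water often accompanies boats). Comparing it against the accuracy obtained from a model that never saw those images would help separate memorization from such ordinary correlation.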