Unsupervised Synthetic Image Attribution: Alignment and Disentanglement

As the quality of synthetic images improves, identifying the concepts underlying model-generated images is becoming increasingly important for copyright protection and model transparency. Existing methods pursue this attribution goal by training models on annotated pairs of synthetic images and their original training sources. However, such paired supervision is hard to obtain, as it requires either carefully designed synthetic concepts or precise annotations drawn from millions of training sources. To eliminate the need for costly paired annotations, we explore the possibility of unsupervised synthetic image attribution. We propose a simple yet effective unsupervised method called Alignment and Disentanglement. Specifically, we first perform basic concept alignment using contrastive self-supervised learning. We then enhance the model's attribution ability by promoting representation disentanglement with the Infomax loss. This approach is motivated by an interesting observation: contrastive self-supervised models, such as MoCo and DINO, inherently exhibit the ability to perform simple cross-domain alignment. By formulating this observation as a theoretical assumption on cross-covariance, we explain theoretically how alignment and disentanglement can approximate the concept-matching process through a decomposition of the canonical correlation analysis objective. On the real-world benchmark AbC, we show that our unsupervised method surprisingly outperforms supervised methods. As a starting point, we expect our intuitive insights and experimental findings to provide a fresh perspective on this challenging task.
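
The two-stage recipe described above (contrastive alignment, then disentanglement) can be illustrated with a minimal NumPy sketch. The abstract does not give the exact loss definitions, so everything here is an assumption: InfoNCE stands in for the contrastive alignment term, and a log-determinant of the feature correlation matrix (a Gaussian-entropy surrogate) stands in for the Infomax loss. The function names `info_nce`, `logdet_decorrelation`, and `total_loss`, as well as the `temperature`, `eps`, and `weight` parameters, are hypothetical and not taken from the paper.

```python
import numpy as np


def info_nce(z_a, z_b, temperature=0.1):
    """InfoNCE loss for paired embeddings: row i of z_a matches row i of z_b.

    Assumption: a generic contrastive alignment term, not the paper's exact loss.
    """
    # L2-normalise so the dot product is a cosine similarity.
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    logits = z_a @ z_b.T / temperature
    # Subtract the row maximum for a numerically stable log-softmax.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Matching pairs sit on the diagonal.
    return -np.mean(np.diag(log_prob))


def logdet_decorrelation(z, eps=1e-4):
    """Negative log-determinant of the feature correlation matrix.

    Assumption: a Gaussian-entropy surrogate for an Infomax-style objective.
    The penalty is close to zero when feature dimensions are uncorrelated
    and grows as dimensions become redundant.
    """
    z = (z - z.mean(axis=0)) / (z.std(axis=0) + 1e-8)
    corr = z.T @ z / z.shape[0]
    # eps * I keeps the matrix well conditioned when features are collinear.
    _, logdet = np.linalg.slogdet(corr + eps * np.eye(corr.shape[0]))
    return -logdet


def total_loss(z_a, z_b, weight=1.0):
    """Alignment term plus a weighted disentanglement term (hypothetical weighting)."""
    disentangle = 0.5 * (logdet_decorrelation(z_a) + logdet_decorrelation(z_b))
    return info_nce(z_a, z_b, temperature=0.1) + weight * disentangle
```

In this sketch, `z_a` and `z_b` would be embeddings of two views or domains produced by the same encoder (for example, synthetic images and candidate source images). The log-determinant term is minimised when the correlation matrix is the identity, which is one simple way to encourage each feature dimension to carry distinct information.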
