Self-Supervised Learning (SSL) has become one of the preferred strategies for learning from limited labeled data and abundant unlabeled data, and SSL models are now widely used. These models are usually pretrained, fine-tuned, and evaluated on the same data distribution, i.e., in an in-distribution setting. However, they tend to perform poorly in out-of-distribution evaluation scenarios, a challenge that Unsupervised Domain Generalization (UDG) seeks to address. This paper introduces a novel method, batch styles standardization, that standardizes the styles of the images in a batch. Relying on Fourier-based augmentations, it promotes domain invariance in SSL by preventing spurious correlations from leaking into the features. Combining batch styles standardization with the well-known contrastive method SimCLR yields a novel UDG method named CLaSSy ($\textbf{C}$ontrastive $\textbf{L}$e$\textbf{a}$rning with $\textbf{S}$tandardized $\textbf{S}$t$\textbf{y}$les). CLaSSy offers clear advantages over prior methods: it does not rely on domain labels and scales to a large number of domains. Experimental results on various UDG datasets show that CLaSSy outperforms existing UDG methods. Finally, the versatility of batch styles standardization is demonstrated by extending the contrastive and non-contrastive SSL methods SwAV and MSN, respectively, with different backbone architectures (convolutional and transformer-based).
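The abstract does not spell out the standardization procedure, so the following is only a minimal sketch of one plausible reading. It assumes the common Fourier-based view that the low-frequency amplitude of an image carries its style while the phase carries its content, and it gives every image in a batch the low-frequency amplitude of one randomly chosen image from that batch. The function name `standardize_batch_styles`, the `ratio` parameter, and the choice of a single in-batch source image are illustrative assumptions, not details confirmed by the paper.

```python
import numpy as np


def standardize_batch_styles(images, ratio=0.1, rng=None):
    """Give all images in a batch the same low-frequency amplitude.

    Hypothetical sketch, not the authors' implementation.
    images: float array of shape (B, H, W, C).
    ratio:  assumed fraction of each spatial axis treated as "low frequency".
    """
    rng = np.random.default_rng() if rng is None else rng
    b, h, w, _ = images.shape

    # 2D FFT over the spatial axes, zero frequency shifted to the center.
    spectrum = np.fft.fftshift(np.fft.fft2(images, axes=(1, 2)), axes=(1, 2))
    amplitude, phase = np.abs(spectrum), np.angle(spectrum)

    # Assumption: one image of the batch serves as the shared style source.
    src = int(rng.integers(b))

    # Window centered on the zero frequency (symmetric, so the output stays real).
    ch, cw = h // 2, w // 2
    rh, rw = max(1, int(h * ratio / 2)), max(1, int(w * ratio / 2))
    rows = slice(max(0, ch - rh), ch + rh + 1)
    cols = slice(max(0, cw - rw), cw + rw + 1)

    # Overwrite every image's low-frequency amplitude with the source's.
    amplitude[:, rows, cols] = amplitude[src, rows, cols].copy()

    # Recombine the shared amplitude with each image's own phase.
    mixed = amplitude * np.exp(1j * phase)
    out = np.fft.ifft2(np.fft.ifftshift(mixed, axes=(1, 2)), axes=(1, 2))
    return out.real
```

Under these assumptions, a call such as `standardize_batch_styles(batch, ratio=0.25)` on a `(B, H, W, C)` array returns a batch of the same shape in which all images share one low-frequency amplitude and each keeps its own phase. In a SimCLR-style pipeline, such a transform would be applied per augmented view, so that style differences within a batch cannot serve as a shortcut for the contrastive objective.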