Unsupervised meta-learning aims to learn feature representations from unsupervised datasets that can transfer to downstream tasks with limited labeled data. In this paper, we propose a novel approach to unsupervised meta-learning that leverages the generalization abilities of in-context learning observed in transformer architectures. Our method reframes meta-learning as a sequence modeling problem, enabling a transformer encoder to infer task context from support images and use it to predict the labels of query images. At the core of our approach lies the creation of diverse tasks, generated by combining data augmentations with a mixing strategy that challenges the model during training while fostering generalization to unseen tasks at test time. Experimental results on benchmark datasets show that our approach outperforms existing unsupervised meta-learning baselines, establishing it as the new state of the art in the field. Remarkably, our method achieves results competitive with supervised and self-supervised approaches, underscoring the model's reliance on generalization rather than memorization.
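To make the sequence-modeling framing concrete, the sketch below shows one way a few-shot task could be serialized for a transformer encoder: support image features are summed with embeddings of their labels, the query image is given a placeholder "unknown" label, and the query's output token is classified. This is a minimal illustration under our own assumptions; the class name `InContextMetaLearner` and all dimensions and hyperparameters are hypothetical, not the paper's implementation.

```python
import torch
import torch.nn as nn

class InContextMetaLearner(nn.Module):
    """Minimal sketch: serialize a few-shot task as a token sequence and
    let a transformer encoder infer the task context in-context."""

    def __init__(self, feat_dim=384, n_classes=5, n_layers=4, n_heads=8):
        super().__init__()
        # One extra embedding index serves as the query's "unknown" label.
        self.label_embed = nn.Embedding(n_classes + 1, feat_dim)
        layer = nn.TransformerEncoderLayer(
            d_model=feat_dim, nhead=n_heads, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.head = nn.Linear(feat_dim, n_classes)

    def forward(self, support_feats, support_labels, query_feat):
        # support_feats: (B, N*K, D) pre-extracted support image features
        # support_labels: (B, N*K) integer labels in [0, N)
        # query_feat: (B, 1, D); its label slot uses the "unknown" index
        unk = torch.full(
            query_feat.shape[:2],
            self.label_embed.num_embeddings - 1,
            dtype=torch.long,
            device=query_feat.device,
        )
        tokens = torch.cat(
            [
                support_feats + self.label_embed(support_labels),
                query_feat + self.label_embed(unk),
            ],
            dim=1,
        )
        out = self.encoder(tokens)  # attention mixes task context into the query token
        return self.head(out[:, -1])  # predict the query's class from its token
```

Under this framing, each training task is just one sequence, so generating diverse tasks (e.g., via augmentations and mixing) amounts to generating diverse sequences, and generalization to unseen tasks at test time reduces to the in-context behavior of the encoder.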