Few-shot learning aims to train models that generalize to novel classes from only a few samples. Recently, a line of work has been proposed to enhance few-shot learning with readily accessible semantic information from class names. However, these works focus on improving existing modules of the standard few-shot learning framework, such as visual prototypes and feature extractors, which limits how fully the semantic information can be exploited. In this paper, we propose a novel few-shot learning framework, based on contrastive learning, that leverages pre-trained language models. To address the challenge of aligning visual features with textual embeddings obtained from a text-based pre-trained language model, we carefully design the textual branch of our framework and introduce a metric module that generalizes the cosine similarity. For better transferability, we let the metric module adapt to different few-shot tasks and adopt MAML to train the model via bi-level optimization. Extensive experiments on multiple benchmarks demonstrate the effectiveness of our method.
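To make the idea of a metric that generalizes cosine similarity concrete, the following is a minimal sketch and not the method from the paper. It assumes a bilinear form with a learnable matrix `M` (hypothetical; setting `M` to the identity recovers plain cosine similarity) and a softmax contrastive loss over class-name embeddings. The function names, the temperature value, and the random features standing in for visual and textual embeddings are all illustrative assumptions.

```python
import numpy as np

def generalized_cosine(visual, text, M):
    """Bilinear similarity between L2-normalized features.

    Assumption for illustration: with M equal to the identity matrix,
    this reduces to ordinary cosine similarity.
    """
    v = visual / np.linalg.norm(visual, axis=-1, keepdims=True)
    t = text / np.linalg.norm(text, axis=-1, keepdims=True)
    return v @ M @ t.T  # shape: (num_images, num_classes)

def contrastive_loss(scores, labels, temperature=0.1):
    """Softmax cross-entropy that pulls each image toward its class text embedding."""
    logits = scores / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

# Random stand-ins for image features and class-name embeddings (hypothetical data).
rng = np.random.default_rng(0)
visual = rng.normal(size=(10, 16))    # 10 query images, 16-dim features
text = rng.normal(size=(5, 16))       # 5 class-name embeddings
labels = rng.integers(0, 5, size=10)  # ground-truth class index per image

M = np.eye(16)  # identity: the metric starts out as plain cosine similarity
scores = generalized_cosine(visual, text, M)
loss = contrastive_loss(scores, labels)
```

In a MAML-style setup, a matrix like `M` would be the part adapted in the inner loop on each task's support set, while the outer loop updates the shared initialization; that training loop is omitted here.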