Few-shot learning, a challenging task in machine learning, aims to learn a classifier that can adapt to recognize new, unseen classes from a limited number of labeled examples. Meta-learning has emerged as a prominent framework for few-shot learning. It was originally trained with task-level learning methods such as Model-Agnostic Meta-Learning (MAML) and Prototypical Networks. A recently proposed training paradigm, Meta-Baseline, which consists of sequential pre-training and meta-training stages, achieves state-of-the-art performance. However, Meta-Baseline is not trained end to end: the meta-training stage can begin only after pre-training has completed. As a result, it suffers from higher training cost and from suboptimal performance caused by the inherent conflicts between the two training stages. To address these limitations, we propose an end-to-end training paradigm consisting of two alternating loops. In the outer loop, we calculate the cross-entropy loss on the entire training set while updating only the final linear layer. In the inner loop, we employ the original meta-learning training mode to calculate the loss and incorporate gradients from the outer loss to guide the parameter updates. This training paradigm not only converges quickly but also outperforms existing baselines, indicating that information from the overall training set and the meta-learning training paradigm can mutually reinforce one another. Moreover, being model-agnostic, our framework achieves significant performance gains, surpassing the baseline systems by approximately 1%.
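The two alternating loops described in the abstract can be sketched roughly as follows. This is a minimal NumPy illustration under strong assumptions, not the authors' implementation. The abstract does not specify the model, the meta-learning loss, or how the outer gradients are combined, so the sketch assumes a linear backbone `W`, a linear head `V`, a prototype classifier with dot-product logits as the episode loss, a hypothetical mixing weight `mix`, and an outer-loss backbone gradient that is computed once per outer iteration and reused across the inner steps. All function names and hyperparameters are illustrative.

```python
import numpy as np


def ce_grad(logits, labels):
    # Mean cross-entropy loss and its gradient with respect to the logits.
    n = logits.shape[0]
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    loss = -log_probs[np.arange(n), labels].mean()
    grad = np.exp(log_probs)
    grad[np.arange(n), labels] -= 1.0
    return loss, grad / n


def outer_step(W, V, x, y, lr):
    # Outer loop: cross entropy on the whole training set. Only the head V
    # is updated; the backbone gradient is returned for the inner loop.
    z = x @ W
    loss, g = ce_grad(z @ V, y)
    grad_V = z.T @ g
    grad_W = x.T @ (g @ V.T)
    return V - lr * grad_V, grad_W, loss


def episode_grad(W, xs, ys, xq, yq, n_way):
    # Assumed meta-learning loss: class prototypes from the support set,
    # dot-product logits for the queries. Returns loss and dLoss/dW.
    zs, zq = xs @ W, xq @ W
    M = np.eye(n_way)[ys].T                 # (n_way, n_support) indicator
    M = M / M.sum(axis=1, keepdims=True)    # each row averages one class
    protos = M @ zs
    loss, g = ce_grad(zq @ protos.T, yq)
    grad_zq = g @ protos
    grad_zs = M.T @ (g.T @ zq)
    return loss, xq.T @ grad_zq + xs.T @ grad_zs


def sample_episode(rng, x, y, n_way, k_shot, n_query):
    # Draw an n_way task; each class needs at least k_shot + n_query samples.
    classes = rng.choice(np.unique(y), size=n_way, replace=False)
    xs, ys, xq, yq = [], [], [], []
    for new_label, c in enumerate(classes):
        idx = rng.permutation(np.flatnonzero(y == c))[: k_shot + n_query]
        xs.append(x[idx[:k_shot]])
        xq.append(x[idx[k_shot:]])
        ys += [new_label] * k_shot
        yq += [new_label] * n_query
    return np.concatenate(xs), np.array(ys), np.concatenate(xq), np.array(yq)


def train(x, y, n_classes, embed_dim=8, outer_iters=5, inner_iters=10,
          n_way=3, k_shot=2, n_query=4, lr=0.05, mix=0.5, seed=0):
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=0.1, size=(x.shape[1], embed_dim))
    V = rng.normal(scale=0.1, size=(embed_dim, n_classes))
    history = []
    for _ in range(outer_iters):
        V, grad_outer, ce = outer_step(W, V, x, y, lr)
        for _ in range(inner_iters):
            episode = sample_episode(rng, x, y, n_way, k_shot, n_query)
            meta_loss, grad_meta = episode_grad(W, *episode, n_way)
            # Assumption: the outer gradient is simply added with weight mix.
            W = W - lr * (grad_meta + mix * grad_outer)
        history.append((ce, meta_loss))
    return W, V, history


def make_toy_data(rng, n_classes=5, per_class=20, dim=6):
    # Synthetic Gaussian clusters standing in for a labeled training set.
    centers = rng.normal(size=(n_classes, dim))
    x = np.concatenate([c + rng.normal(size=(per_class, dim)) for c in centers])
    y = np.repeat(np.arange(n_classes), per_class)
    return x, y
```

Calling `train(*make_toy_data(np.random.default_rng(0)), n_classes=5)` runs the alternation on synthetic data. Because both losses are optimized in a single run, there is no separate pre-training stage to finish before episodic training starts, which is the end-to-end property the abstract contrasts with Meta-Baseline.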