Gradient-based meta-learning techniques aim to distill useful prior knowledge from a set of training tasks so that new tasks can be learned more efficiently with gradient descent. Although these methods have succeeded in a variety of scenarios, they commonly adapt all parameters of trainable layers when learning new tasks. This neglects potentially more efficient learning strategies for a given task distribution and may leave the methods susceptible to overfitting, especially in few-shot learning, where tasks must be learned from a limited number of examples. To address these issues, we propose Subspace Adaptation Prior (SAP), a novel gradient-based meta-learning algorithm that jointly learns good initialization parameters (prior knowledge) and layer-wise parameter subspaces, in the form of operation subsets, that should be adaptable. In this way, SAP can learn which operation subsets to adjust with gradient descent based on the underlying task distribution, while decreasing the risk of overfitting when learning new tasks. We demonstrate that this ability is helpful: SAP yields superior or competitive performance in few-shot image classification settings (gains between 0.1% and 3.9% in accuracy). Analysis of the learned subspaces demonstrates that low-dimensional operations often yield high activation strengths, indicating that they may be important for achieving good few-shot learning performance. For reproducibility, we release all our research code publicly.
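The idea of adapting only a subset of operations during task-specific learning can be illustrated with a toy sketch. This is not the authors' implementation: the model (a linear map with a scalar `scale` and a `shift`), the softmax gating, and every name below are hypothetical choices made for illustration. The activation strengths and the initialization are fixed by hand here, whereas SAP meta-learns both across tasks; that outer loop is omitted.

```python
import math


def softmax(logits):
    """Turn per-operation logits into activation strengths that sum to 1."""
    top = max(logits.values())
    exps = {name: math.exp(value - top) for name, value in logits.items()}
    total = sum(exps.values())
    return {name: value / total for name, value in exps.items()}


def predict(params, x):
    """Toy model: y = scale * (w . x) + shift."""
    dot = sum(wi * xi for wi, xi in zip(params["w"], x))
    return params["scale"] * dot + params["shift"]


def mse(params, data):
    """Mean squared error over a list of (x, y) pairs."""
    return sum((predict(params, x) - y) ** 2 for x, y in data) / len(data)


def gradients(params, data):
    """Analytic gradients of the MSE with respect to each operation."""
    grads = {"w": [0.0] * len(params["w"]), "scale": 0.0, "shift": 0.0}
    for x, y in data:
        dot = sum(wi * xi for wi, xi in zip(params["w"], x))
        err = 2.0 * (params["scale"] * dot + params["shift"] - y) / len(data)
        for i, xi in enumerate(x):
            grads["w"][i] += err * params["scale"] * xi
        grads["scale"] += err * dot
        grads["shift"] += err
    return grads


def adapt(init, gate_logits, support, lr=0.1, steps=20):
    """Inner-loop adaptation in which each operation's update is scaled
    by its activation strength (an assumed gating scheme)."""
    gates = softmax(gate_logits)
    params = {"w": list(init["w"]), "scale": init["scale"], "shift": init["shift"]}
    for _ in range(steps):
        g = gradients(params, support)
        params["w"] = [wi - lr * gates["w"] * gi
                       for wi, gi in zip(params["w"], g["w"])]
        params["scale"] -= lr * gates["scale"] * g["scale"]
        params["shift"] -= lr * gates["shift"] * g["shift"]
    return params


# Hand-picked values standing in for meta-learned quantities.
init = {"w": [1.0, -1.0], "scale": 1.0, "shift": 0.0}
# Only the one-dimensional shift operation is effectively adaptable.
gate_logits = {"w": -50.0, "scale": -50.0, "shift": 0.0}
# A new task whose targets equal the prior's predictions plus 2.
support = [([1.0, 0.0], 3.0), ([0.0, 1.0], 1.0), ([1.0, 1.0], 2.0)]

adapted = adapt(init, gate_logits, support)
```

In this toy task the three support examples differ from the prior's predictions only by a constant offset, so adapting the single `shift` parameter is enough to fit them, while the full weight vector `w` stays essentially at its initial value. Restricting adaptation to such a low-dimensional operation leaves fewer free parameters to fit from a handful of examples, which is the overfitting argument made in the abstract.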