Given a finite set of sample points, meta-learning algorithms aim to learn an optimal adaptation strategy for new, unseen tasks. Often, this data is ambiguous, as it may belong to several tasks concurrently; this is particularly common in meta-regression. In such cases, the estimated adaptation strategy is subject to high variance due to the limited amount of support data per task, which often leads to sub-optimal generalization performance. In this work, we address the problem of variance reduction in gradient-based meta-learning and formalize the class of problems prone to it, a condition we refer to as \emph{task overlap}. Specifically, we propose a novel approach that reduces the variance of the gradient estimate by weighting each support point individually by the variance of its posterior over the parameters. To estimate the posterior, we utilize the Laplace approximation, which allows us to express the variance in terms of the curvature of the loss landscape of our meta-learner. Experimental results demonstrate the effectiveness of the proposed method and highlight the importance of variance reduction in meta-learning.
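To make the weighting idea concrete, the following minimal sketch shows one way a curvature-based per-point weighting could enter a MAML-style inner loop. It is an illustrative assumption rather than the paper's exact method: the function name `laplace_weighted_adapt`, the squared per-sample gradient norm as a curvature proxy (a diagonal Gauss-Newton-style stand-in for the Laplace Hessian), and the normalization of the weights are all hypothetical choices.

```python
import torch

def laplace_weighted_adapt(model, loss_fn, xs, ys, inner_lr=0.01, eps=1e-8):
    """One inner-loop adaptation step with per-point variance weighting.

    Hypothetical sketch: under a diagonal Laplace approximation, the
    posterior variance of the parameters is inversely related to the
    curvature of the loss. Here the per-point curvature is approximated
    by the squared norm of that point's loss gradient, and points whose
    posterior is sharper (lower variance) receive larger weights.
    """
    params = [p for p in model.parameters() if p.requires_grad]

    # Per-sample curvature proxy: squared gradient norm of each support point's loss.
    curvatures = []
    for x, y in zip(xs, ys):
        loss = loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0))
        grads = torch.autograd.grad(loss, params)
        curvatures.append(sum(g.pow(2).sum() for g in grads))
    curv = torch.stack(curvatures)

    # Higher curvature -> lower posterior variance -> larger weight
    # (normalization to a simplex is an assumption of this sketch).
    weights = curv / (curv.sum() + eps)

    # Weighted inner-loop gradient step on the support set.
    weighted_loss = sum(
        w * loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0))
        for w, x, y in zip(weights, xs, ys)
    )
    grads = torch.autograd.grad(weighted_loss, params)
    with torch.no_grad():
        for p, g in zip(params, grads):
            p -= inner_lr * g
    return model
```

In this sketch, ambiguous support points that sit in flat regions of the loss landscape (high posterior variance) contribute less to the adaptation gradient, which is the intended variance-reduction effect; whether the true method weights by variance or its inverse, and how weights are normalized, should be taken from the paper itself.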