Many recent theoretical works on \emph{meta-learning} aim to establish guarantees for leveraging similar representational structure across related tasks to simplify a target task. The main goal of this line of theory is to understand the extent to which convergence rates, in learning a common representation, \emph{may scale with the number $N$ of tasks} (as well as the number of samples per task). First results in this setting demonstrate this property when both the representation shared among tasks and the task-specific regression functions are linear. This linear setting readily reveals the benefits of aggregating tasks, e.g., via averaging arguments. In practice, however, the representation is often highly nonlinear, introducing nontrivial biases in each task that cannot easily be averaged out as in the linear case. In the present work, we derive theoretical guarantees for meta-learning with nonlinear representations. In particular, assuming the shared nonlinearity maps to an infinite-dimensional reproducing kernel Hilbert space (RKHS), we show that the additional biases can be mitigated with careful regularization that leverages the smoothness of task-specific regression functions.