Many recent theoretical works on \emph{meta-learning} seek guarantees for leveraging representational structure shared across related tasks to simplify a target task. The main aim of this line of theory is to understand the extent to which convergence rates, in learning a common representation, \emph{may scale with the number $N$ of tasks} (as well as the number of samples per task). First steps in this setting demonstrate this property when both the representation shared amongst tasks and the task-specific regression functions are linear. This linear setting readily reveals the benefits of aggregating tasks, e.g., via averaging arguments. In practice, however, the representation is often highly nonlinear, introducing nontrivial biases in each task that cannot easily be averaged out as in the linear case. In the present work, we derive theoretical guarantees for meta-learning with nonlinear representations. In particular, assuming the shared nonlinearity maps to an infinite-dimensional reproducing kernel Hilbert space (RKHS), we show that the additional biases can be mitigated with careful regularization that leverages the smoothness of task-specific regression functions.