Many recent theoretical works on \emph{meta-learning} aim to establish guarantees for leveraging similar representational structures from related tasks to simplify a target task. The main aim of theoretical work on the subject is to understand the extent to which convergence rates in learning a common representation \emph{may scale with the number $N$ of tasks} (as well as the number of samples per task). First steps in this setting demonstrate this property when both the representation shared among tasks and the task-specific regression functions are linear. This linear setting readily reveals the benefits of aggregating tasks, e.g., via averaging arguments. In practice, however, the representation is often highly nonlinear, introducing nontrivial biases in each task that cannot easily be averaged out as in the linear case. In the present work, we derive theoretical guarantees for meta-learning with nonlinear representations. In particular, assuming the shared nonlinearity maps to an infinite-dimensional reproducing kernel Hilbert space (RKHS), we show that the additional biases can be mitigated with careful regularization that leverages the smoothness of task-specific regression functions.