Many recent theoretical works on meta-learning aim to establish guarantees for leveraging similar representational structure across related tasks to simplify a target task. A central aim of this theoretical work is to understand the extent to which convergence rates, in learning a common representation, scale with the number $N$ of tasks (as well as the number of samples per task). First steps in this setting demonstrate this property when both the shared representation amongst tasks and the task-specific regression functions are linear. This linear setting readily reveals the benefits of aggregating tasks, e.g., via averaging arguments. In practice, however, the representation is often highly nonlinear, introducing nontrivial biases in each task that cannot easily be averaged out as in the linear case. In the present work, we derive theoretical guarantees for meta-learning with nonlinear representations. In particular, assuming the shared nonlinearity maps to an infinite-dimensional RKHS, we show that additional biases can be mitigated with careful regularization that leverages the smoothness of task-specific regression functions.
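The averaging effect in the linear setting can be illustrated numerically. The following is a minimal sketch under stated assumptions, not the estimator analyzed in this work: it assumes Gaussian inputs, a shared $k$-dimensional linear subspace, per-task least-squares fits, and a simple SVD-based aggregation step. The function name `subspace_error` and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np

def subspace_error(num_tasks, n=50, d=20, k=2, noise=1.0, seed=0):
    """Estimate a shared k-dim linear representation from `num_tasks` tasks.

    Returns the Frobenius norm of the part of the true basis that lies
    outside the estimated subspace (0 means perfect recovery).
    All modelling choices here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    # Ground-truth shared representation: orthonormal basis B (d x k).
    B, _ = np.linalg.qr(rng.standard_normal((d, k)))
    W_hat = np.empty((d, num_tasks))
    for i in range(num_tasks):
        alpha = rng.standard_normal(k)   # task-specific linear regression head
        X = rng.standard_normal((n, d))  # n samples for task i
        y = X @ (B @ alpha) + noise * rng.standard_normal(n)
        # Per-task least squares: an unbiased but noisy estimate of B @ alpha.
        W_hat[:, i] = np.linalg.lstsq(X, y, rcond=None)[0]
    # Aggregate tasks: the top-k left singular vectors of the stacked
    # per-task estimates span the estimated shared subspace.
    U, _, _ = np.linalg.svd(W_hat, full_matrices=False)
    B_hat = U[:, :k]
    # Residual of B after projecting onto span(B_hat).
    return np.linalg.norm(B - B_hat @ (B_hat.T @ B))

if __name__ == "__main__":
    for N in (5, 50, 500):
        print(N, subspace_error(N))
```

Because each per-task estimate is unbiased in this linear model, its noise averages out as tasks are added, so the reported error should shrink as $N$ grows. With a nonlinear representation, regularized per-task estimates carry bias, and this simple aggregation would no longer average it away.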