Modern learning systems excel at interpolation but struggle to generalize to unseen tasks that lie outside the support of the training distribution. This failure occurs even in simple settings, such as task parameters beyond the training range, and persists despite advances in foundation models. To address it, we develop the Relational Task Extrapolator (RTE), an algorithm designed to enable systematic extrapolation to novel tasks. The key observation is that extrapolation is inherently relational: extrapolating to unseen tasks requires learning how tasks transform into one another. If a model learns the transformation between tasks A and B during training, it can apply that same transformation to relate known tasks to unseen ones at test time. RTE operationalizes this idea by decomposing each target task into a known anchor task and a transformation linking the anchor to the target. It then learns a relational operator that maps an anchor-transformation pair to predictions for the target task. We instantiate RTE across multiple task extrapolation regimes in function prediction, including settings where target tasks use out-of-range parameters (parameter extrapolation), have greater compositional depth (length extrapolation), or recombine function primitives in unseen ways (compositional extrapolation), as well as combinations of these. We further extend RTE to sequence prediction by integrating it into fine-tuning algorithms for foundation models. Across our empirical studies, RTE substantially outperforms existing approaches on extrapolation to unseen tasks.
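To make the anchor-plus-transformation idea concrete, the following is a toy sketch of parameter extrapolation, not the RTE implementation itself. Everything in it is an assumption made for illustration: the task family f_theta(x) = theta * x, the use of the parameter difference as the "transformation", the hand-chosen feature map, and the names `task`, `features`, and `predict`. The relational operator is fit by ordinary least squares on pairs of training tasks with parameters in [0, 2], then applied to a target task with theta = 3.5, which lies outside the training range but is reachable from a known anchor (theta = 2.0) through a transformation (delta = 1.5) of a size seen during training.

```python
import numpy as np

rng = np.random.default_rng(0)

def task(theta, x):
    """Toy task family (an assumption for this sketch): f_theta(x) = theta * x."""
    return theta * x

def features(anchor_value, delta, x):
    # Hypothetical, hand-chosen feature map for this toy family:
    # the anchor task's output and the transformation applied to the input.
    return np.stack([anchor_value, delta * x], axis=-1)

# Training pairs: anchor and target parameters are both drawn from [0, 2].
n = 200
theta_anchor = rng.uniform(0.0, 2.0, n)
theta_target = rng.uniform(0.0, 2.0, n)
x_train = rng.uniform(-1.0, 1.0, n)

# The "transformation" here is simply the parameter difference (an assumption).
X = features(task(theta_anchor, x_train), theta_target - theta_anchor, x_train)
y = task(theta_target, x_train)

# Fit a linear relational operator on (anchor, transformation) pairs.
weights, *_ = np.linalg.lstsq(X, y, rcond=None)

def predict(anchor_theta, target_theta, x):
    """Predict the target task from a known anchor task plus a transformation."""
    x = np.asarray(x, dtype=float)
    anchor_value = task(anchor_theta, x)   # output of the known anchor task
    delta = target_theta - anchor_theta    # transformation linking anchor to target
    return features(anchor_value, delta, x) @ weights

# Test: target theta = 3.5 is outside the training range [0, 2],
# but the anchor (2.0) and the transformation (1.5) are both in range.
x_test = np.array([-1.0, 0.25, 0.8])
pred = predict(2.0, 3.5, x_test)
print(pred, task(3.5, x_test))
```

In this toy setting the operator extrapolates only because the chosen features make the relation between tasks linear. With a different task family, the feature map or the operator class would need to change.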