Identification of Negative Transfers in Multitask Learning Using Surrogate Models

Multitask learning is widely used in practice to train a low-resource target task by augmenting it with multiple related source tasks. Yet, because of negative transfer, naively combining all the source tasks with a target task does not always improve the prediction performance on the target task. Thus, a critical problem in multitask learning is identifying subsets of source tasks that would benefit the target task. This problem is computationally challenging because the number of subsets grows exponentially with the number of source tasks, and efficient heuristics for subset selection do not always capture the relationship between task subsets and multitask learning performance. In this paper, we introduce an efficient procedure to address this problem via surrogate modeling. In surrogate modeling, we sample random subsets of source tasks and precompute their multitask learning performance. Then, we approximate the precomputed performance with a linear regression model that can also predict the multitask performance of unseen task subsets. We show theoretically and empirically that fitting this model requires sampling only a number of subsets that is linear in the number of source tasks. The fitted model provides a relevance score between each source task and the target task. We use the relevance scores to perform subset selection for multitask learning by thresholding. Through extensive experiments, we show that our approach predicts negative transfers from multiple source tasks to target tasks much more accurately than existing task affinity measures. Additionally, we demonstrate that for several weak supervision datasets, our approach consistently improves upon existing optimization methods for multitask learning.
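The surrogate modeling procedure described above can be illustrated with a minimal sketch. This is not the paper's implementation; it rests on several assumptions made for illustration only. Each source task is included in a subset independently with probability 0.5. The performance score is higher-is-better, for example target-task accuracy. The function `train_and_evaluate` is a hypothetical placeholder that simulates multitask training with a synthetic additive effect per task plus noise. The names `sample_subsets`, `fit_surrogate`, and `select_tasks` are also hypothetical.

```python
import numpy as np

def sample_subsets(num_tasks, num_subsets, rng):
    """Return a (num_subsets, num_tasks) 0/1 matrix; each row marks one random subset.

    Assumption: each source task is included independently with probability 0.5.
    """
    return (rng.random((num_subsets, num_tasks)) < 0.5).astype(float)

def fit_surrogate(masks, scores):
    """Least-squares fit of scores ~ masks @ theta + bias.

    theta[i] plays the role of the relevance score of source task i.
    """
    design = np.hstack([masks, np.ones((masks.shape[0], 1))])
    coef, *_ = np.linalg.lstsq(design, scores, rcond=None)
    return coef[:-1], coef[-1]

def select_tasks(theta, threshold):
    """Keep source tasks whose relevance score exceeds the threshold."""
    return np.flatnonzero(theta > threshold)

# Hypothetical stand-in for "train a multitask model on this subset and
# measure target-task performance". The effects below are synthetic.
TRUE_EFFECT = np.array([0.04, 0.03, 0.02, -0.03, -0.05, 0.02, -0.02, -0.04])

def train_and_evaluate(mask, rng):
    baseline = 0.70  # assumed score of the target task trained alone
    return baseline + mask @ TRUE_EFFECT + rng.normal(scale=0.002)

rng = np.random.default_rng(0)
num_tasks = len(TRUE_EFFECT)
num_subsets = 8 * num_tasks  # linear in the number of source tasks

masks = sample_subsets(num_tasks, num_subsets, rng)
scores = np.array([train_and_evaluate(m, rng) for m in masks])

theta, bias = fit_surrogate(masks, scores)
selected = select_tasks(theta, threshold=0.0)

print("relevance scores:", np.round(theta, 3))
print("selected source tasks:", selected.tolist())
```

In this toy setting, source tasks with a negative fitted coefficient are treated as negative transfers and dropped. If the measured quantity were a loss instead of an accuracy, the comparison in `select_tasks` would be reversed. The threshold itself is a tunable choice.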
