Auxiliary-Task Learning (ATL) aims to improve performance on a target task by leveraging knowledge obtained from related tasks. Occasionally, learning multiple tasks simultaneously results in lower accuracy than learning only the target task, a phenomenon known as negative transfer. Negative transfer is often attributed to gradient conflicts among tasks, and previous work frequently tackles it by coordinating the task gradients. However, these optimization-based methods largely overlook how well the auxiliary tasks generalize to the target task. To better understand the root cause of negative transfer, we experimentally investigate it from both the optimization and the generalization perspectives. Based on our findings, we introduce ForkMerge, a novel approach that periodically forks the model into multiple branches, automatically searches for the varying task weights by minimizing the target validation error, and dynamically merges all branches to filter out detrimental task-parameter updates. On a series of auxiliary-task learning benchmarks, ForkMerge outperforms existing methods and effectively mitigates negative transfer.
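The fork, search, and merge cycle described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a linear-regression target task, a single auxiliary task, exactly two branches (auxiliary weight 0 and 1), and a simple grid search over one merge coefficient. All names (`fork_merge`, `train_branch`, `make_data`) and all hyperparameters are hypothetical and chosen only for illustration.

```python
import numpy as np


def mse(theta, X, y):
    """Mean squared error of the linear model X @ theta."""
    return float(np.mean((X @ theta - y) ** 2))


def grad_mse(theta, X, y):
    """Gradient of mse with respect to theta."""
    return 2.0 * X.T @ (X @ theta - y) / len(y)


def train_branch(theta, target, aux, aux_weight, steps, lr):
    """Gradient descent on target loss + aux_weight * auxiliary loss."""
    (Xt, yt), (Xa, ya) = target, aux
    theta = theta.copy()
    for _ in range(steps):
        theta -= lr * (grad_mse(theta, Xt, yt) + aux_weight * grad_mse(theta, Xa, ya))
    return theta


def fork_merge(theta, target, aux, val, rounds=20, steps=10, lr=0.05, grid=11):
    """Toy fork/search/merge loop (assumed two-branch variant)."""
    Xv, yv = val
    history = []
    for _ in range(rounds):
        # Fork: train two branches from the same parameters,
        # one on the target task only, one with the auxiliary task added.
        b0 = train_branch(theta, target, aux, 0.0, steps, lr)
        b1 = train_branch(theta, target, aux, 1.0, steps, lr)
        # Search: pick the interpolation weight with the lowest
        # target validation error.
        lams = np.linspace(0.0, 1.0, grid)
        errs = [mse((1.0 - lam) * b0 + lam * b1, Xv, yv) for lam in lams]
        best = int(np.argmin(errs))
        # Merge: interpolate the branch parameters with the chosen weight.
        theta = (1.0 - lams[best]) * b0 + lams[best] * b1
        history.append((float(lams[best]), errs[best], mse(b0, Xv, yv), mse(b1, Xv, yv)))
    return theta, history


# Synthetic data (assumed): the auxiliary task is a perturbed copy of the target.
rng = np.random.default_rng(0)
w_true = rng.normal(size=5)


def make_data(n, w):
    X = rng.normal(size=(n, 5))
    return X, X @ w + 0.1 * rng.normal(size=n)


target = make_data(20, w_true)
aux = make_data(200, w_true + 0.3 * rng.normal(size=5))
val = make_data(200, w_true)
theta, history = fork_merge(np.zeros(5), target, aux, val)
```

Because the search grid contains both endpoints, the merged parameters in each round are never worse on the validation set than either branch alone. In this simplified setting, that is the sense in which merging can filter out a detrimental auxiliary update: when the auxiliary branch hurts, the search can fall back to the target-only branch.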