One of the grand, enduring goals of AI is to create generalist agents that can learn multiple different tasks from diverse data via multitask learning (MTL). In practice, however, applying gradient descent (GD) to the average loss across all tasks may yield poor multitask performance because certain tasks are severely under-optimized. Previous approaches that manipulate task gradients to achieve a more balanced loss decrease require storing and computing all task gradients ($\mathcal{O}(k)$ space and time, where $k$ is the number of tasks), which limits their use in large-scale scenarios. In this work, we introduce Fast Adaptive Multitask Optimization (FAMO), a dynamic weighting method that decreases task losses in a balanced way using $\mathcal{O}(1)$ space and time. We conduct an extensive set of experiments covering multitask supervised and reinforcement learning problems. Our results indicate that FAMO achieves comparable or superior performance to state-of-the-art gradient manipulation techniques while offering significant improvements in space and computational efficiency. Code is available at \url{https://github.com/Cranial-XIX/FAMO}.
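The contrast between the three update styles mentioned above can be sketched in illustrative notation that is not taken from the abstract. The sketch assumes shared parameters $\theta$, task losses $\ell_1, \dots, \ell_k$, a step size $\alpha$, a generic gradient-combination rule $f$, and nonnegative per-step weights $w_{i,t}$ that are treated as constants when differentiating. The specific rule FAMO uses to set the weights is not stated here and is left abstract.

$$
\begin{aligned}
\text{average-loss GD:}\quad & \theta_{t+1} = \theta_t - \alpha \, \nabla_\theta \, \frac{1}{k} \sum_{i=1}^{k} \ell_i(\theta_t), \\
\text{gradient manipulation:}\quad & \theta_{t+1} = \theta_t - \alpha \, f\big(g_{1,t}, \dots, g_{k,t}\big), \qquad g_{i,t} = \nabla_\theta \, \ell_i(\theta_t), \\
\text{dynamic weighting:}\quad & \theta_{t+1} = \theta_t - \alpha \, \nabla_\theta \sum_{i=1}^{k} w_{i,t} \, \ell_i(\theta_t), \qquad w_{i,t} \ge 0.
\end{aligned}
$$

Under this reading, the second form needs all $k$ task gradients $g_{i,t}$ at every step, whereas the first and third differentiate a single scalar objective, so the number of backward passes per step does not grow with $k$.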