Recent multi-task learning research argues against unitary scalarization, where training simply minimizes the sum of the task losses. Several ad hoc multi-task optimization algorithms have instead been proposed, inspired by various hypotheses about what makes multi-task settings difficult. The majority of these optimizers require per-task gradients and introduce significant memory, runtime, and implementation overhead. We show that unitary scalarization, coupled with standard regularization and stabilization techniques from single-task learning, matches or improves upon the performance of complex multi-task optimizers in popular supervised and reinforcement learning settings. We then present an analysis suggesting that many specialized multi-task optimizers can be partly interpreted as forms of regularization, potentially explaining our surprising results. We believe our results call for a critical reevaluation of recent research in the area.
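To make the term concrete, the following is a minimal sketch of unitary scalarization on a toy problem. The setup is an assumption made purely for illustration and is not taken from the paper's experiments: two quadratic task losses share one parameter vector, and the function names (`task_loss`, `summed_grad`, `train`) and hyperparameters are hypothetical.

```python
# Toy illustration (hypothetical setup, not the paper's experiments):
# two quadratic "task" losses that share one parameter vector theta.

def task_loss(theta, target):
    """Squared distance between the shared parameters and one task's target."""
    return sum((t - y) ** 2 for t, y in zip(theta, target))

def total_loss(theta, targets):
    """Unitary scalarization: the unweighted sum of all task losses."""
    return sum(task_loss(theta, target) for target in targets)

def summed_grad(theta, targets):
    """Analytic gradient of total_loss with respect to theta.

    By linearity this equals the sum of the per-task gradients, but only
    the summed gradient is ever needed; an autodiff framework would obtain
    it from a single backward pass on the summed loss.
    """
    return [
        sum(2.0 * (theta[i] - target[i]) for target in targets)
        for i in range(len(theta))
    ]

def train(theta, targets, lr=0.1, steps=200):
    """Plain gradient descent on the summed loss."""
    for _ in range(steps):
        grad = summed_grad(theta, targets)
        theta = [t - lr * g for t, g in zip(theta, grad)]
    return theta

targets = [[1.0, 0.0], [0.0, 1.0]]   # one target per task
theta = train([5.0, -3.0], targets)  # approaches the mean of the targets
```

In this toy case the summed loss is minimized at the mean of the task targets. The multi-task optimizers discussed above would instead inspect or modify each task's gradient separately before combining them, which is where the extra memory and runtime cost comes from.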