Multi-task learning solves multiple correlated tasks jointly, but the tasks may conflict with one another. In such circumstances, a single solution can rarely optimize all the tasks, leading to performance trade-offs. To arrive at a set of optimized yet well-distributed models that collectively embody different trade-offs in one algorithmic pass, this paper proposes to view Pareto multi-task learning through the lens of multi-task optimization. Multi-task learning is first cast as a multi-objective optimization problem, which is then decomposed into a diverse set of unconstrained scalar-valued subproblems. These subproblems are solved jointly using a novel multi-task gradient descent method, whose uniqueness lies in the iterative transfer of model parameters among the subproblems during the course of optimization. We present a theorem proving that including such transfers yields faster convergence. We evaluate the proposed approach, multi-task learning with multi-task optimization, on various problem settings including image classification, scene understanding, and multi-target regression. Comprehensive experiments confirm that the proposed method significantly advances the state-of-the-art in discovering sets of Pareto-optimized models. Notably, on NYUv2, the large image dataset we tested on, the hypervolume convergence achieved by our method was found to be nearly two times faster than that of the next-best state-of-the-art method.
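The decomposition-and-transfer idea can be illustrated on a toy problem. The following is a minimal sketch, not the paper's algorithm: it assumes two quadratic objectives, a weighted-sum scalarization, and a simple transfer rule in which each subproblem adopts a neighboring subproblem's parameters whenever they score better on its own scalarized objective. The function names, the toy objectives, and the transfer rule are all assumptions made for illustration; the abstract does not specify how the transfer is actually performed.

```python
import numpy as np


def objectives(x, a, b):
    """Two conflicting toy objectives: squared distance to a and to b (assumed)."""
    return np.array([np.sum((x - a) ** 2), np.sum((x - b) ** 2)])


def objective_grads(x, a, b):
    """Gradients of the two toy objectives, stacked as a (2, d) array."""
    return np.stack([2.0 * (x - a), 2.0 * (x - b)])


def solve_with_transfer(a, b, num_subproblems=5, steps=200, lr=0.1, seed=0):
    """Jointly solve weighted-sum subproblems with a hypothetical transfer step."""
    rng = np.random.default_rng(seed)
    # One weight vector per subproblem, spread evenly over the trade-off range.
    w1 = np.linspace(0.0, 1.0, num_subproblems)
    weights = np.stack([w1, 1.0 - w1], axis=1)  # shape (K, 2)
    params = rng.normal(size=(num_subproblems, a.size))

    for _ in range(steps):
        # Gradient step on each unconstrained scalar-valued subproblem.
        for k in range(num_subproblems):
            grad = weights[k] @ objective_grads(params[k], a, b)
            params[k] = params[k] - lr * grad

        # Transfer step (assumed rule): adopt a neighbor's parameters if they
        # give a lower value of this subproblem's own scalarized objective.
        snapshot = params.copy()
        for k in range(num_subproblems):
            best = weights[k] @ objectives(snapshot[k], a, b)
            for j in (k - 1, k + 1):
                if 0 <= j < num_subproblems:
                    candidate = weights[k] @ objectives(snapshot[j], a, b)
                    if candidate < best:
                        best = candidate
                        params[k] = snapshot[j]
    return weights, params


if __name__ == "__main__":
    a = np.array([1.0, 0.0])
    b = np.array([0.0, 1.0])
    weights, params = solve_with_transfer(a, b)
    for w, x in zip(weights, params):
        print(w, x, objectives(x, a, b))
```

For these toy objectives each subproblem with weights (w, 1 - w) has the closed-form minimizer w·a + (1 - w)·b, so the returned parameters trace out the line segment between a and b, one point per trade-off. A weighted sum is used here only because it is the simplest scalarization; it is not claimed to be the decomposition the paper uses.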