Revisiting Scalarization in Multi-Task Learning: A Theoretical Perspective

Linear scalarization, i.e., combining all loss functions into a single weighted sum, has been the default choice in the multi-task learning (MTL) literature since its inception. In recent years, there has been a surge of interest in developing Specialized Multi-Task Optimizers (SMTOs) that treat MTL as a multi-objective optimization problem. However, it remains an open question whether SMTOs have a fundamental advantage over scalarization. In fact, the community is engaged in heated debate comparing these two types of algorithms, mostly from an empirical perspective. To approach this question, we revisit scalarization from a theoretical perspective. We focus on linear MTL models and study whether scalarization is capable of fully exploring the Pareto front. Our findings reveal that, in contrast to recent works that claimed empirical advantages of scalarization, scalarization is inherently incapable of full exploration, especially for those Pareto optimal solutions that strike balanced trade-offs between multiple tasks. More concretely, when the model is under-parametrized, we reveal a multi-surface structure of the feasible region and identify necessary and sufficient conditions for full exploration. This leads to the conclusion that scalarization is in general incapable of tracing out the Pareto front. Our theoretical results partially answer the open questions in Xin et al. (2021) and provide a more intuitive explanation of why scalarization fails, one that goes beyond non-convexity. We additionally perform experiments on a real-world dataset using both scalarization and state-of-the-art SMTOs. The experimental results not only corroborate our theoretical findings, but also unveil the potential of SMTOs in finding balanced solutions that cannot be achieved by scalarization.
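
To make the notion of "exploring the Pareto front by scalarization" concrete, the following is a minimal toy sketch rather than the paper's linear MTL setting or its multi-surface analysis. It assumes a hypothetical finite set of candidate solutions with two task losses each, and checks which Pareto optimal candidates can be selected by minimizing a weighted sum w * L1 + (1 - w) * L2 for some w in [0, 1]. The candidate values and the helper names `pareto_mask` and `scalarization_reachable` are illustrative assumptions. The toy only shows the classical non-convexity failure mode, in which a balanced Pareto optimal point is never the minimizer of any weighted sum.

```python
import numpy as np


def pareto_mask(losses):
    """Boolean mask of rows not dominated by any other row (lower is better)."""
    n = losses.shape[0]
    mask = np.ones(n, dtype=bool)
    for i in range(n):
        # Row j dominates row i if it is no worse on every task
        # and strictly better on at least one task.
        dominated_by = (
            np.all(losses <= losses[i], axis=1)
            & np.any(losses < losses[i], axis=1)
        )
        mask[i] = not dominated_by.any()
    return mask


def scalarization_reachable(losses, num_weights=1001):
    """Indices of rows minimizing w*L1 + (1-w)*L2 for some w on a grid in [0, 1]."""
    reached = set()
    for w in np.linspace(0.0, 1.0, num_weights):
        scores = losses @ np.array([w, 1.0 - w])
        # Keep every row that ties for the minimum at this weight.
        ties = np.flatnonzero(np.isclose(scores, scores.min()))
        reached.update(ties.tolist())
    return sorted(reached)


# Hypothetical (task-1 loss, task-2 loss) pairs for four candidate solutions.
candidates = np.array([
    [0.0, 1.0],  # best on task 1
    [1.0, 0.0],  # best on task 2
    [0.6, 0.6],  # balanced trade-off, still Pareto optimal
    [0.9, 0.9],  # dominated by the balanced candidate
])

pareto = np.flatnonzero(pareto_mask(candidates)).tolist()
reachable = scalarization_reachable(candidates)

print("Pareto optimal candidates:", pareto)
print("Reachable by scalarization:", reachable)
```

In this toy, the balanced candidate has a weighted sum of 0.6 for every w, while the better of the two extreme candidates scores min(w, 1 - w), which is at most 0.5, so no choice of weights selects the balanced point even though it is Pareto optimal.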
