No More Tuning: Prioritized Multi-Task Learning with Lagrangian Differential Multiplier Methods

Given the ubiquity of multiple tasks in practical systems, Multi-Task Learning (MTL) has found widespread application across diverse domains. In real-world scenarios, these tasks often have different priorities. For instance, in web search, relevance is often prioritized over other metrics, such as click-through rate or user engagement. Existing frameworks pay insufficient attention to prioritization among tasks; they typically differentiate task priorities by adjusting task-specific loss weights. However, this approach scales poorly: as the number of tasks grows, the complexity of hyper-parameter tuning increases exponentially. Furthermore, optimizing multiple objectives simultaneously can degrade the performance of high-priority tasks because of interference from lower-priority tasks. In this paper, we introduce a novel multi-task learning framework that employs Lagrangian Differential Multiplier Methods for step-wise multi-task optimization. It is designed to boost the performance of high-priority tasks without interference from other tasks. Its primary advantage is that it optimizes multiple objectives automatically, without balancing hyper-parameters for different tasks, thereby eliminating the need for manual tuning. Additionally, we provide a theoretical analysis showing that our method comes with optimization guarantees, which makes the process more reliable. We demonstrate its effectiveness through experiments on multiple public datasets and through its application in Taobao search, a large-scale industrial search ranking system, where it yields significant improvements across various business metrics.
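
To make the step-wise idea concrete, the following is a minimal toy sketch, not the paper's implementation. It assumes two hypothetical quadratic losses over a two-dimensional parameter vector, a small slack `EPSILON` on the high-priority loss, and hand-picked step sizes; all function names and constants are illustrative assumptions. Stage 1 minimizes the high-priority loss alone. Stage 2 minimizes the low-priority loss subject to the high-priority loss staying within the slack of its stage-1 value, using gradient descent on the parameters and projected gradient ascent on a Lagrange multiplier (a basic differential multiplier update).

```python
# Toy sketch (assumption-laden, not the authors' code): prioritized two-task
# optimization with a Lagrange multiplier learned by gradient ascent.

def high_priority_loss(theta):
    # Hypothetical high-priority task: depends only on the shared parameter theta[0].
    return (theta[0] - 1.0) ** 2

def high_priority_grad(theta):
    return [2.0 * (theta[0] - 1.0), 0.0]

def low_priority_loss(theta):
    # Hypothetical low-priority task: conflicts with the first task on theta[0].
    return (theta[0] - 3.0) ** 2 + (theta[1] - 2.0) ** 2

def low_priority_grad(theta):
    return [2.0 * (theta[0] - 3.0), 2.0 * (theta[1] - 2.0)]

def minimize_high_priority(theta, lr=0.1, steps=500):
    """Stage 1: plain gradient descent on the high-priority loss only."""
    theta = list(theta)
    for _ in range(steps):
        grad = high_priority_grad(theta)
        theta = [t - lr * g for t, g in zip(theta, grad)]
    return theta

def minimize_low_priority_constrained(theta, budget, lr_theta=0.02,
                                      lr_lam=0.5, steps=20000):
    """Stage 2: minimize the low-priority loss subject to
    high_priority_loss(theta) <= budget.

    Lagrangian: low(theta) + lam * (high(theta) - budget).
    theta takes a descent step, lam takes an ascent step and is clipped at 0.
    """
    theta, lam = list(theta), 0.0
    for _ in range(steps):
        violation = high_priority_loss(theta) - budget
        g_low = low_priority_grad(theta)
        g_high = high_priority_grad(theta)
        theta = [t - lr_theta * (gl + lam * gh)
                 for t, gl, gh in zip(theta, g_low, g_high)]
        lam = max(0.0, lam + lr_lam * violation)
    return theta, lam

EPSILON = 0.04  # assumed slack on the high-priority loss (illustrative value)

theta_stage1 = minimize_high_priority([0.0, 0.0])
best_high = high_priority_loss(theta_stage1)
theta_final, lam_final = minimize_low_priority_constrained(
    theta_stage1, best_high + EPSILON)

print("stage 1 theta:", theta_stage1, "high-priority loss:", best_high)
print("stage 2 theta:", theta_final, "multiplier:", lam_final)
print("final losses:", high_priority_loss(theta_final),
      low_priority_loss(theta_final))
```

In this toy setup no task weight is tuned by hand: the multiplier grows while the high-priority constraint is violated and shrinks otherwise. The stationary point of the update is `theta[0] = 1.2`, `theta[1] = 2.0` with a multiplier of 9, so the high-priority loss stays at the assumed slack of 0.04. By comparison, minimizing the unweighted sum of the two toy losses would put `theta[0]` at 2.0 and raise the high-priority loss to 1.0. The slack is needed here only because the gradient of a quadratic loss vanishes at its minimum; whether and how the paper relaxes the constraint is not stated in the abstract.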