Conflicting optimization criteria arise naturally in various Deep Learning scenarios. They may correspond to different main tasks (as in Multi-Task Learning) or to a main and a secondary objective, such as loss minimization versus sparsity. The usual approach is a simple weighting of the criteria, which formally works only in the convex setting. In this paper, we present a Multi-Objective Optimization algorithm that uses a modified Weighted Chebyshev scalarization to train Deep Neural Networks (DNNs) with respect to several tasks. With this scalarization, the algorithm can identify all optimal solutions of the original problem while reducing it to a sequence of single-objective problems. These simplified problems are then solved with an Augmented Lagrangian method, which handles the constraints effectively and allows the use of popular optimizers such as Adam and Stochastic Gradient Descent. Our work aims to address the economic and ecological sustainability of DNN models, with a particular focus on Deep Multi-Task models, which are typically designed with a very large number of weights so that they perform equally well on multiple tasks. Through experiments on two Machine Learning datasets, we demonstrate that the model can be sparsified adaptively during training without significantly affecting its performance, provided that task-specific adaptations are applied to the network weights. Code is available at https://github.com/salomonhotegni/MDMTN.
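To make the pipeline described above concrete, the following is a minimal sketch of a standard (unmodified) Weighted Chebyshev scalarization solved with an Augmented Lagrangian method. It is not the authors' implementation: the function name `chebyshev_al`, the toy objectives f1(x) = x^2 and f2(x) = (x - 2)^2, the ideal point (0, 0), and all hyperparameters are assumptions chosen for illustration, and plain gradient descent stands in for Adam or SGD. The scalarized problem is: minimize t subject to w_i (f_i(x) - z_i) - t <= 0 for each objective i.

```python
def chebyshev_al(weights, ideal=(0.0, 0.0), mu=5.0, lr=0.01,
                 outer_steps=30, inner_steps=1000):
    """Toy bi-objective problem solved via Weighted Chebyshev + Augmented Lagrangian.

    Hypothetical objectives: f1(x) = x^2 and f2(x) = (x - 2)^2.
    Constraints: g_i(x, t) = w_i * (f_i(x) - z_i) - t <= 0.
    """
    objectives = [lambda x: x * x, lambda x: (x - 2.0) ** 2]
    gradients = [lambda x: 2.0 * x, lambda x: 2.0 * (x - 2.0)]

    def constraint(i, x, t):
        return weights[i] * (objectives[i](x) - ideal[i]) - t

    x, t = 1.0, 1.0          # primal variables (assumed starting point)
    lam = [0.0, 0.0]         # Lagrange multiplier estimates

    for _ in range(outer_steps):
        # Inner loop: gradient descent on the augmented Lagrangian
        #   L(x, t) = t + 1/(2*mu) * sum_i (max(0, lam_i + mu*g_i)^2 - lam_i^2)
        for _ in range(inner_steps):
            grad_x, grad_t = 0.0, 1.0
            for i in range(2):
                m = max(0.0, lam[i] + mu * constraint(i, x, t))
                grad_x += m * weights[i] * gradients[i](x)
                grad_t -= m
            x -= lr * grad_x
            t -= lr * grad_t
        # Outer step: multiplier update for inequality constraints
        lam = [max(0.0, lam[i] + mu * constraint(i, x, t)) for i in range(2)]

    return x, t, lam


if __name__ == "__main__":
    # Sweeping the weight vector traces different trade-offs between f1 and f2.
    for w in [(0.9, 0.1), (0.5, 0.5), (0.1, 0.9)]:
        x, t, _ = chebyshev_al(w)
        print(w, round(x, 3), round(t, 3))
```

For this toy problem the scalarized optimum can be checked by hand: it lies where the two weighted objectives are equal, so equal weights (0.5, 0.5) give x = 1 and t = 0.5. In a DNN setting, x would be the network weights and the inner loop would be replaced by mini-batch steps of a stochastic optimizer.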