Conflicting optimization criteria arise naturally in many Deep Learning scenarios. They may correspond to different main tasks (as in Multi-Task Learning), or to a main task and a secondary one, such as loss minimization versus sparsity. The usual approach is a simple weighted sum of the criteria, which is formally guaranteed to work only in the convex setting. In this paper, we present a Multi-Objective Optimization algorithm that uses a modified Weighted Chebyshev scalarization to train Deep Neural Networks (DNNs) with respect to several tasks. With this scalarization, the algorithm can identify all optimal solutions of the original problem while reducing it to a sequence of single-objective problems. These simplified problems are then solved with an Augmented Lagrangian method, which handles the constraints effectively and allows the use of popular optimizers such as Adam and Stochastic Gradient Descent. Our work aims to address the economic and ecological sustainability of DNN models, with a particular focus on Deep Multi-Task models, which are typically designed with a very large number of weights so that they perform equally well on multiple tasks. Through experiments on two Machine Learning datasets, we demonstrate that the model can be sparsified adaptively during training without significantly impacting its performance, provided we are willing to apply task-specific adaptations to the network weights. Code is available at https://github.com/salomonhotegni/MDMTN.
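To make the scalarization step concrete, the following is a minimal sketch on a toy problem, not the method implemented in the linked repository. It assumes the standard (unmodified) Weighted Chebyshev scalarization in epigraph form, "minimize t subject to w_i (f_i(x) - z_i) <= t", and solves it with a textbook Augmented Lagrangian loop whose inner problem is handled by plain gradient descent, standing in for Adam or SGD. The two quadratic objectives, the reference point z, the function name `chebyshev_augmented_lagrangian`, and all hyperparameters are illustrative assumptions.

```python
import numpy as np

# Hypothetical toy objectives (not from the paper): squared distances to two
# anchor points. The Pareto set is the segment between A and B.
A = np.array([0.0, 0.0])
B = np.array([1.0, 0.0])


def objectives(x):
    """Return the objective vector (f1(x), f2(x))."""
    return np.array([np.sum((x - A) ** 2), np.sum((x - B) ** 2)])


def jacobian(x):
    """Return the 2 x dim matrix whose rows are the gradients of f1 and f2."""
    return np.stack([2.0 * (x - A), 2.0 * (x - B)])


def chebyshev_augmented_lagrangian(w, z=None, mu=5.0, lr=0.01,
                                   outer=15, inner=1000):
    """Solve  min_{x,t} t  s.t.  w_i * (f_i(x) - z_i) <= t  for a weight vector w.

    Assumed textbook scheme: inequality-constrained Augmented Lagrangian with a
    fixed penalty parameter mu and gradient descent on the inner problem.
    """
    w = np.asarray(w, dtype=float)
    z = np.zeros(2) if z is None else np.asarray(z, dtype=float)
    x, t, lam = np.zeros(2), 0.0, np.zeros(2)
    for _ in range(outer):
        for _ in range(inner):
            g = w * (objectives(x) - z) - t        # constraint values g_i <= 0
            m = np.maximum(0.0, lam + mu * g)      # shifted, clipped multipliers
            grad_x = (m * w) @ jacobian(x)         # d/dx of the augmented Lagrangian
            grad_t = 1.0 - m.sum()                 # d/dt of the augmented Lagrangian
            x = x - lr * grad_x
            t = t - lr * grad_t
        g = w * (objectives(x) - z) - t
        lam = np.maximum(0.0, lam + mu * g)        # multiplier update
    return x, t, lam


if __name__ == "__main__":
    # Sweeping the weights traces different trade-offs between f1 and f2.
    for w1 in (0.2, 0.5, 0.8):
        x, t, lam = chebyshev_augmented_lagrangian([w1, 1.0 - w1])
        print(w1, x, objectives(x), t)
```

Each weight vector yields one single-objective constrained problem, so sweeping the weights produces a sequence of such problems, one per trade-off. In a DNN setting the analytic gradients above would be replaced by automatic differentiation, and one of the objectives could be a sparsity measure rather than a task loss.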