Adapting models pre-trained on large-scale datasets to a variety of downstream tasks is a common strategy in deep learning. Consequently, parameter-efficient fine-tuning methods have emerged as a promising way to adapt pre-trained models to different tasks while training only a minimal number of parameters. While most of these methods are designed for single-task adaptation, parameter-efficient training in Multi-Task Learning (MTL) architectures remains largely unexplored. In this paper, we introduce MTLoRA, a novel framework for parameter-efficient training of MTL models. MTLoRA employs Task-Agnostic and Task-Specific Low-Rank Adaptation modules, which effectively disentangle the parameter space in MTL fine-tuning, enabling the model to handle both task specialization and task interaction within MTL contexts. We apply MTLoRA to hierarchical-transformer-based MTL architectures, adapting them to multiple downstream dense prediction tasks. Our extensive experiments on the PASCAL dataset show that MTLoRA achieves higher accuracy on downstream tasks than fully fine-tuning the MTL model, while reducing the number of trainable parameters by 3.6x. Furthermore, MTLoRA establishes a Pareto-optimal trade-off between the number of trainable parameters and downstream-task accuracy, outperforming current state-of-the-art parameter-efficient training methods in both accuracy and efficiency. Our code is publicly available.
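The core idea of combining a shared (task-agnostic) low-rank adapter with per-task (task-specific) low-rank adapters on top of a frozen pretrained weight can be sketched as follows. This is a minimal illustration based on our reading of the abstract, not the paper's actual implementation; class and variable names (`MTLoRALinear`, `A_shared`, `B_task`, the rank and dimension values) are illustrative assumptions.

```python
import numpy as np

class MTLoRALinear:
    """Sketch of a linear layer with one task-agnostic and several
    task-specific low-rank (LoRA-style) adapters. Only the low-rank
    factors A/B would be trained; the pretrained weight W stays frozen."""

    def __init__(self, d_in, d_out, rank, num_tasks, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.standard_normal((d_out, d_in))  # frozen pretrained weight
        # Shared (task-agnostic) adapter; B is zero-initialized so the
        # low-rank update B @ A starts as a no-op, as in standard LoRA.
        self.A_shared = rng.standard_normal((rank, d_in)) * 0.01
        self.B_shared = np.zeros((d_out, rank))
        # One (task-specific) adapter pair per task.
        self.A_task = [rng.standard_normal((rank, d_in)) * 0.01
                       for _ in range(num_tasks)]
        self.B_task = [np.zeros((d_out, rank)) for _ in range(num_tasks)]

    def forward(self, x, task=None):
        """x: (batch, d_in). task=None yields the shared (task-agnostic)
        output; an integer task index adds that task's specific update."""
        y = x @ (self.W + self.B_shared @ self.A_shared).T
        if task is not None:
            y = y + x @ (self.B_task[task] @ self.A_task[task]).T
        return y

layer = MTLoRALinear(d_in=8, d_out=4, rank=2, num_tasks=3)
x = np.ones((1, 8))
# With zero-initialized B factors, every adapter update is initially zero,
# so each task's output matches the frozen pretrained layer's output.
assert np.allclose(layer.forward(x, task=0), x @ layer.W.T)
```

The parameter savings come from training only the A/B factors: for each adapted weight of size `d_out x d_in`, the trainable cost per adapter is `rank * (d_in + d_out)` instead of `d_in * d_out`.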