Challenges and Opportunities of Using Transformer-Based Multi-Task Learning in NLP Through ML Lifecycle: A Survey

The increasing adoption of natural language processing (NLP) models across industries has created a need among practitioners for machine learning systems that handle these models efficiently, from training to serving them in production. However, training, deploying, and updating multiple models can be complex, costly, and time-consuming, especially when using transformer-based pre-trained language models. Multi-Task Learning (MTL) has emerged as a promising approach to improve efficiency and performance by training a single model jointly on several tasks rather than training separate models. Motivated by this, we first provide an overview of transformer-based MTL approaches in NLP. We then discuss the challenges and opportunities of using MTL approaches throughout the typical ML lifecycle, focusing on the data engineering, model development, deployment, and monitoring phases. This survey focuses on transformer-based MTL architectures and, to the best of our knowledge, is novel in that it systematically analyses how transformer-based MTL in NLP fits into ML lifecycle phases. Furthermore, we motivate research on the connection between MTL and continual learning (CL), as this area remains unexplored. We believe it would be practical to have a model that can handle both MTL and CL, as this would make it easier to periodically re-train the model, update it in response to distribution shifts, and add new capabilities to meet real-world requirements.



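Multi-Task Learning

Multi-task learning (MTL) is a subfield of machine learning in which multiple learning tasks are solved at the same time while exploiting the commonalities and differences across tasks. Compared with training models separately, this can improve learning efficiency and prediction accuracy for the task-specific models. Multi-task learning is an approach to inductive transfer: it improves generalization by using the domain information contained in the training signals of related tasks as an inductive bias. It does so by learning tasks in parallel with a shared representation, so that what is learned for each task can help the other tasks be learned better.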
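The idea of learning tasks in parallel on a shared representation can be illustrated with a minimal sketch. It assumes the simplest form of parameter sharing: one encoder shared by all tasks, one linear head per task, and a joint loss that is the unweighted sum of the per-task losses. The class name `SharedEncoderMTL`, the task names `sentiment` and `topic`, and all layer sizes are hypothetical and chosen only for illustration; the survey itself covers transformer encoders, which this toy dense layer merely stands in for.

```python
import math
import random


def init_layer(rng, n_out, n_in):
    """Return (weights, bias) for a dense layer with small random weights."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def linear(weights, bias, x):
    """Apply a dense layer: one output value per row of `weights`."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


class SharedEncoderMTL:
    """Hypothetical toy model: one shared encoder, one linear head per task."""

    def __init__(self, n_in, n_hidden, task_sizes, seed=0):
        rng = random.Random(seed)
        # Parameters shared by every task.
        self.encoder = init_layer(rng, n_hidden, n_in)
        # Task-specific parameters, keyed by task name.
        self.heads = {
            name: init_layer(rng, n_out, n_hidden)
            for name, n_out in task_sizes.items()
        }

    def encode(self, x):
        """Compute the shared representation used by all heads."""
        return [math.tanh(h) for h in linear(*self.encoder, x)]

    def forward(self, x):
        """Encode once, then produce logits for every task."""
        shared = self.encode(x)
        return {name: linear(*head, shared) for name, head in self.heads.items()}


def softmax_cross_entropy(logits, label):
    """Numerically stable cross-entropy for a single example."""
    m = max(logits)
    log_norm = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_norm - logits[label]


def joint_loss(model, x, labels):
    """Unweighted sum of per-task losses (an assumption; weighting schemes vary)."""
    outputs = model.forward(x)
    return sum(softmax_cross_entropy(outputs[task], y) for task, y in labels.items())


model = SharedEncoderMTL(n_in=4, n_hidden=3, task_sizes={"sentiment": 2, "topic": 5})
x = [0.1, -0.2, 0.3, 0.4]
outputs = model.forward(x)
loss = joint_loss(model, x, {"sentiment": 1, "topic": 3})
```

Because both heads read the same encoder output, a gradient step on the joint loss would update the encoder with signal from every task, which is how one task's training signal can act as an inductive bias for the others.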
