The increasing adoption of natural language processing (NLP) models across industries has created a need among practitioners for machine learning systems that handle these models efficiently, from training to serving them in production. However, training, deploying, and updating multiple models can be complex, costly, and time-consuming, particularly when using transformer-based pre-trained language models. Multi-Task Learning (MTL) has emerged as a promising approach to improve efficiency and performance by training one model jointly on several tasks rather than training a separate model for each task. Motivated by this, we first provide an overview of transformer-based MTL approaches in NLP. Then, we discuss the challenges and opportunities of using MTL approaches throughout typical ML lifecycle phases, focusing specifically on the data engineering, model development, deployment, and monitoring phases. This survey focuses on transformer-based MTL architectures and, to the best of our knowledge, is novel in that it systematically analyses how transformer-based MTL in NLP fits into ML lifecycle phases. Furthermore, we motivate research on the connection between MTL and continual learning (CL), as this area remains unexplored. We believe it would be practical to have a model that can handle both MTL and CL, as this would make it easier to periodically re-train the model, update it in response to distribution shifts, and add new capabilities to meet real-world requirements.