We present our work on developing a multilingual, efficient text-to-text transformer that is suitable for handling long inputs. This model, called mLongT5, builds upon the architecture of LongT5, while leveraging the multilingual datasets used for pretraining mT5 and the pretraining tasks of UL2. We evaluate this model on a variety of multilingual summarization and question-answering tasks, and the results show stronger performance for mLongT5 when compared to existing multilingual models such as mBART or M-BERT.