Fine-tuning and testing a multilingual large language model is expensive and challenging for low-resource languages (LRLs). Previous studies have used machine learning methods to predict performance on natural language processing (NLP) tasks, but they focus primarily on high-resource languages and overlook both LRLs and domain shift. Focusing on LRLs, we investigate three factors: the size of the fine-tuning corpus, the domain similarity between the fine-tuning and testing corpora, and the language similarity between the source and target languages. We employ classical regression models to assess how these factors affect model performance. Our results indicate that domain similarity is the most important factor in predicting the performance of machine translation models.
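As a rough illustration of how a classical regression can relate the three factors to a performance score, the sketch below fits an ordinary least squares model on standardized features, so the magnitudes of the coefficients can be compared across factors. The abstract does not say which regression models, features, or metrics were used. Everything here is therefore an assumption: the function names, the feature names (`corpus_size`, `domain_similarity`, `language_similarity`), and the toy data, which is randomly generated with arbitrary weights and says nothing about real translation systems.

```python
import random
import statistics


def standardize(column):
    """Rescale a column to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)
    return [(value - mean) / std for value in column]


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_standardized_ols(features, target):
    """Fit OLS on standardized features via the normal equations.

    `features` maps a (hypothetical) factor name to a list of values.
    Returns a dict with an "intercept" entry and one coefficient per factor.
    """
    names = list(features)
    columns = [standardize(features[name]) for name in names]
    design = [[1.0] + [col[i] for col in columns] for i in range(len(target))]
    k = len(names) + 1
    xtx = [[sum(row[p] * row[q] for row in design) for q in range(k)]
           for p in range(k)]
    xty = [sum(row[p] * y for row, y in zip(design, target)) for p in range(k)]
    beta = solve_linear_system(xtx, xty)
    return dict(zip(["intercept"] + names, beta))


if __name__ == "__main__":
    # Toy data only: the weights below are arbitrary, not findings.
    rng = random.Random(0)
    n = 200
    data = {
        "corpus_size": [rng.uniform(0, 1) for _ in range(n)],
        "domain_similarity": [rng.uniform(0, 1) for _ in range(n)],
        "language_similarity": [rng.uniform(0, 1) for _ in range(n)],
    }
    score = [
        10 + 2 * s + 5 * d + 3 * l + rng.gauss(0, 0.5)
        for s, d, l in zip(*data.values())
    ]
    for name, coef in fit_standardized_ols(data, score).items():
        print(f"{name}: {coef:.3f}")
```

Standardizing the features puts the coefficients on a common scale, which is one simple way to compare how strongly each factor is associated with the score. It is not the only option: correlated factors or nonlinear effects would call for other diagnostics.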