Performance prediction is a method to estimate the performance of multilingual language models (LMs), reducing the computational costs of fine-tuning, which scale with model capacity and data size. Our paper introduces ProxyLM, a scalable framework for predicting LM performance on multilingual tasks using proxy models. These proxy models act as surrogates, approximating the performance of fine-tuned LMs on specific downstream natural language processing (NLP) tasks. By leveraging proxy models, ProxyLM significantly reduces the computational overhead of task evaluations, achieving up to a 37.08x speedup over traditional methods, even with our smallest proxy models. Additionally, our methodology adapts to languages unseen during LM pre-training, outperforming the state of the art by a factor of 1.89 as measured by root-mean-square error (RMSE). This framework streamlines model selection, enabling efficient deployment and iterative LM enhancements without extensive computational resources.
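The surrogate idea described above can be illustrated with a minimal sketch: fit a regressor that maps cheap proxy-model scores to the downstream score of a fully fine-tuned LM, then report RMSE on held-out tasks. All data and the linear form below are synthetic illustrations, not ProxyLM's actual features or regressor.

```python
import numpy as np

# Hypothetical sketch of proxy-based performance prediction:
# predict a fine-tuned LM's task score from a cheap proxy model's score.
rng = np.random.default_rng(0)

# Synthetic "tasks": assume the proxy score correlates with the target score.
proxy_scores = rng.uniform(10, 40, size=50)                      # e.g. proxy BLEU
target_scores = 1.2 * proxy_scores + 5 + rng.normal(0, 1.0, 50)  # fine-tuned BLEU

# Split tasks into train/test.
X_train, X_test = proxy_scores[:40], proxy_scores[40:]
y_train, y_test = target_scores[:40], target_scores[40:]

# Least-squares linear predictor: target ≈ w * proxy + b.
A = np.stack([X_train, np.ones_like(X_train)], axis=1)
(w, b), *_ = np.linalg.lstsq(A, y_train, rcond=None)

# Evaluate with RMSE, the metric used in the abstract.
pred = w * X_test + b
rmse = float(np.sqrt(np.mean((pred - y_test) ** 2)))
print(f"RMSE on held-out tasks: {rmse:.2f}")
```

In practice a framework like this would use richer task and dataset features and a stronger regressor; the point here is only that evaluating the small proxy is far cheaper than fine-tuning the large LM for every candidate configuration.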