Recent advances in the development of pre-trained Spanish language models have led to significant progress in many Natural Language Processing (NLP) tasks, such as question answering. However, the lack of efficient models is a barrier to adopting them in resource-constrained environments. Smaller distilled models for Spanish could therefore prove highly scalable and facilitate adoption across a variety of tasks and scenarios. In this work, we take one step in this direction by developing SpanishTinyRoBERTa, a compressed language model based on RoBERTa for efficient question answering in Spanish. To achieve this, we employ knowledge distillation from a large model onto a lighter model that can be deployed more widely, even where computational resources are limited, with negligible loss in performance. Our experiments show that the dense distilled model preserves the performance of its larger counterpart while significantly increasing inference speed. This work serves as a starting point for further research on model compression for Spanish language models across various NLP tasks.
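For readers unfamiliar with knowledge distillation, the following is a minimal sketch of a generic soft-target distillation loss for a single prediction. It is not taken from this work: the function names, the temperature of 2.0, and the mixing weight `alpha` are illustrative assumptions, and the exact objective used to train SpanishTinyRoBERTa may differ. In extractive question answering, a loss of this kind would typically be applied to the start-position and end-position logits separately.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of temperature-scaled logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in scaled))
    return [x - log_sum_exp for x in scaled]


def distillation_loss(student_logits, teacher_logits, gold_index,
                      temperature=2.0, alpha=0.5):
    """Generic distillation objective (hypothetical hyperparameters).

    Combines KL(teacher || student) on temperature-softened distributions
    with the usual cross-entropy against the gold label.
    """
    log_p_teacher = log_softmax(teacher_logits, temperature)
    log_p_student = log_softmax(student_logits, temperature)

    # Soft-target term: how far the student is from the teacher.
    soft = sum(math.exp(lt) * (lt - ls)
               for lt, ls in zip(log_p_teacher, log_p_student))

    # Hard-target term: cross-entropy on the unscaled student logits.
    hard = -log_softmax(student_logits)[gold_index]

    # The temperature**2 factor keeps the soft term's gradient scale
    # comparable to the hard term as the temperature changes.
    return alpha * temperature ** 2 * soft + (1.0 - alpha) * hard


if __name__ == "__main__":
    teacher = [4.0, 1.0, 0.5]   # made-up logits, for illustration only
    student = [2.5, 1.5, 0.2]
    print(distillation_loss(student, teacher, gold_index=0))
```

A student that reproduces the teacher's logits exactly incurs no soft-target penalty, so with `alpha=1.0` the loss is zero in that case; the hard-target term keeps the student anchored to the annotated answers.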