Recent advances in pre-trained Spanish language models have led to significant progress in many Natural Language Processing (NLP) tasks, such as question answering. However, the lack of efficient models is a barrier to adopting them in resource-constrained environments. Smaller distilled models for Spanish could therefore prove highly scalable and facilitate adoption across a variety of tasks and scenarios. In this work, we take one step in this direction by developing SpanishTinyRoBERTa, a compressed language model based on RoBERTa for efficient question answering in Spanish. To achieve this, we employ knowledge distillation from a large model onto a lighter model, which allows for wider deployment, even in settings with limited computational resources, at a negligible cost in performance. Our experiments show that the dense distilled model can still preserve the performance of its larger counterpart while significantly increasing inference speed. This work serves as a starting point for further research on model compression for Spanish language models across various NLP tasks.
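To make the knowledge distillation step concrete, the following is a minimal sketch of a common soft-target distillation objective applied to the answer-start logits of an extractive question answering model. It is an illustration only, not the training procedure used for SpanishTinyRoBERTa: the function names, the `temperature` and `alpha` values, and the example logits are all assumptions introduced here.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities, softened by a temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, gold_index,
                      temperature=2.0, alpha=0.5):
    """Blend a soft-target term (teacher) with a hard-target term (gold label).

    temperature and alpha are hypothetical hyperparameters chosen for
    illustration; they are not values reported in the abstract.
    """
    # Soft term: KL(teacher || student) over temperature-softened distributions,
    # scaled by temperature^2 so its gradient magnitude stays comparable.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(pt * math.log(pt / ps)
             for pt, ps in zip(p_teacher, p_student) if pt > 0)
    soft_term = kl * temperature ** 2

    # Hard term: cross-entropy of the student against the gold answer position.
    hard_term = -math.log(softmax(student_logits)[gold_index])

    return alpha * soft_term + (1 - alpha) * hard_term


# Hypothetical start-position logits over a four-token passage.
teacher = [0.1, 2.0, 4.5, 0.3]
student_close = [0.1, 2.0, 4.5, 0.3]   # student that matches the teacher
student_far = [4.5, 2.0, 0.1, 0.3]     # student that prefers the wrong token

loss_close = distillation_loss(student_close, teacher, gold_index=2)
loss_far = distillation_loss(student_far, teacher, gold_index=2)
```

In this sketch a student whose logits match the teacher's incurs no soft-target penalty, so `loss_close` is smaller than `loss_far`. In a span-extraction setting the same term would typically be computed for the answer-end logits as well and the two averaged.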