Although advances in pre-trained Large Language Models have significantly accelerated recent progress in NLP, their ever-increasing size poses major challenges for conventional fine-tuning, especially in memory-intensive tasks. We investigate the potential of Parameter-Efficient Fine-Tuning, focusing on Low-Rank Adaptation (LoRA), in the domain of multilingual summarization, a task that is both challenging (due to typically long inputs) and relatively unexplored. We conduct an extensive study across different data availability scenarios, including high- and low-data settings and cross-lingual transfer, leveraging models of different sizes. Our findings reveal that LoRA is competitive with full fine-tuning when trained on large amounts of data, and that it excels in low-data scenarios and cross-lingual transfer. We also study different strategies for few-shot cross-lingual transfer, finding that continued LoRA tuning outperforms both full fine-tuning and the dynamic composition of language-specific LoRA modules.
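For readers unfamiliar with LoRA, the following is a minimal sketch of the general idea, not the configuration used in this study. A frozen weight matrix W of shape (d_out, d_in) is left untouched, and only a low-rank update (alpha / r) * B A is trained, with B of shape (d_out, r) and A of shape (r, d_in). The helper names, the layer dimensions, the rank, and the scaling value below are illustrative assumptions.

```python
def matmul(X, Y):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]


def lora_delta(B, A, alpha, r):
    """Low-rank weight update (alpha / r) * B @ A that is added to the frozen W."""
    scale = alpha / r
    return [[scale * v for v in row] for row in matmul(B, A)]


def trainable_params(d_out, d_in, r):
    """Trainable parameter counts for one linear layer: full fine-tuning vs. LoRA."""
    return {"full": d_out * d_in, "lora": r * (d_in + d_out)}


# Hypothetical layer size and rank, chosen only for illustration.
counts = trainable_params(d_out=1024, d_in=1024, r=8)

# B is conventionally initialised to zero, so the update starts at zero
# and the adapted model initially behaves exactly like the frozen one.
zero_update = lora_delta(B=[[0.0], [0.0]], A=[[1.0, 2.0, 3.0]], alpha=16, r=1)
```

With these assumed dimensions, the adapter trains 16,384 parameters for a layer whose full weight matrix has 1,048,576, which is the kind of saving that makes parameter-efficient tuning attractive for long-input tasks such as summarization.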