LLMs Are Few-Shot In-Context Low-Resource Language Learners

In-context learning (ICL) empowers large language models (LLMs) to perform diverse tasks in underrepresented languages using only short in-context information, offering a crucial avenue for narrowing the gap between high-resource and low-resource languages. Nonetheless, only a handful of works have explored ICL for low-resource languages, and most of them focus on relatively high-resource languages such as French and Spanish. In this work, we extensively study ICL and its cross-lingual variation (X-ICL) on 25 low-resource and 7 relatively higher-resource languages. Our study not only assesses the effectiveness of ICL with LLMs in low-resource languages but also identifies the shortcomings of in-context label alignment and introduces a more effective alternative: query alignment. Moreover, we provide valuable insights into various facets of ICL for low-resource languages. We conclude that few-shot in-context information is significant for enhancing the low-resource language understanding quality of LLMs: semantically relevant information closes the language gap in the target language and aligns the semantics between the targeted low-resource language and the high-resource language in which the model is proficient. Our work highlights the importance of advancing ICL research, particularly for low-resource languages. Our code is publicly released at https://github.com/SamuelCahyawijaya/in-context-alignment
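
To make the distinction between label alignment and query alignment more concrete, the following is a minimal sketch of how a cross-lingual ICL prompt might be assembled. It is an illustration under stated assumptions, not the released implementation: the `Exemplar` class, the `build_xicl_prompt` function, and every prompt template string are hypothetical. The sketch assumes that label alignment prepends source-to-target label translations and that query alignment prepends parallel sentence pairs, with exemplars written in the high-resource source language and the query in the low-resource target language.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Exemplar:
    text: str   # exemplar input, written in the high-resource source language
    label: str  # exemplar label, written in the source language


def build_xicl_prompt(
    exemplars: List[Exemplar],
    query: str,
    label_map: Optional[Dict[str, str]] = None,
    parallel_pairs: Optional[List[Tuple[str, str]]] = None,
) -> str:
    """Assemble a cross-lingual ICL prompt (all templates are hypothetical)."""
    lines: List[str] = []

    # Query alignment (assumed form): show parallel sentences so the model
    # can relate target-language inputs to source-language inputs.
    for src_sentence, tgt_sentence in parallel_pairs or []:
        lines.append(f'"{src_sentence}" means "{tgt_sentence}"')

    # Label alignment (assumed form): show how each source-language label
    # is expressed in the target language.
    for src_label, tgt_label in (label_map or {}).items():
        lines.append(f'The label "{src_label}" means "{tgt_label}"')

    # Few-shot exemplars in the source language.
    for ex in exemplars:
        lines.append(f"Input: {ex.text}\nLabel: {ex.label}")

    # The query in the target language, left open for the model to complete.
    lines.append(f"Input: {query}\nLabel:")
    return "\n".join(lines)


if __name__ == "__main__":
    prompt = build_xicl_prompt(
        exemplars=[Exemplar("I love this film.", "positive")],
        query="Filmnya bagus sekali.",
        parallel_pairs=[("The food is delicious.", "Makanannya enak.")],
    )
    print(prompt)
```

Passing only `parallel_pairs` gives a query-aligned prompt, passing only `label_map` gives a label-aligned one, and passing neither gives plain X-ICL, so the variants can be compared on the same exemplars and query.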


