In-context learning (ICL) emerges as large language models become capable of inferring test labels conditioned on a few labeled samples, without any gradient update. ICL-enabled large language models offer a promising way to bypass recurring annotation costs in low-resource settings. Yet only a handful of past studies have explored ICL in a cross-lingual setting, where transferring label knowledge from a high-resource language to a low-resource one is crucial. To bridge this gap, we provide the first in-depth analysis of ICL for cross-lingual text classification. We find that the prevalent practice of selecting random input-label pairs to construct the prompt context is severely limited in cross-lingual ICL, primarily because of a lack of alignment in both the input and output spaces. To mitigate this, we propose a novel prompt construction strategy, Cross-lingual In-context Source-Target Alignment (X-InSTA). By injecting semantic coherence into the input examples and adding a task-based alignment across the source and target languages, X-InSTA outperforms random prompt selection by a large margin across three different tasks using 44 different cross-lingual pairs.
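To make the contrast with random prompt selection concrete, the following is a minimal sketch of one way such a prompt could be assembled: pick the source-language examples most similar to the target input, then append a sentence that maps source labels to target labels. It is an illustration under stated assumptions, not the X-InSTA implementation. The names `TOY_LEXICON`, `embed`, `select_demonstrations`, and `build_prompt`, the "Review/Rating" template, and the wording of the alignment sentence are all hypothetical, and the tiny bilingual lexicon is only a stand-in for a real multilingual sentence encoder.

```python
import math
from collections import Counter

# Hypothetical stand-in for a multilingual encoder: maps words from both
# languages onto shared concept ids so that similarity works across languages.
TOY_LEXICON = {
    "battery": "battery", "batería": "battery",
    "food": "food", "comida": "food",
}


def embed(text):
    """Toy concept-count vector (assumption: a real system would use a multilingual encoder)."""
    return Counter(TOY_LEXICON[w] for w in text.lower().split() if w in TOY_LEXICON)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def select_demonstrations(pool, query, k):
    """Return the k labeled source examples most similar to the target query,
    instead of sampling them at random."""
    q = embed(query)
    ranked = sorted(pool, key=lambda ex: cosine(embed(ex[0]), q), reverse=True)
    return ranked[:k]


def build_prompt(demos, query, label_map, tgt_lang):
    """Concatenate the demonstrations, a label-alignment sentence, and the query."""
    blocks = [f"Review: {text}\nRating: {label}" for text, label in demos]
    # Hypothetical alignment sentence linking source labels to target labels.
    pairs = " and ".join(f"'{s}' means '{t}'" for s, t in label_map.items())
    blocks.append(f"In {tgt_lang}, {pairs}.")
    blocks.append(f"Review: {query}\nRating:")
    return "\n\n".join(blocks)


if __name__ == "__main__":
    pool = [
        ("the battery died after a week", "bad"),
        ("the food was delicious", "good"),
        ("great battery life", "good"),
    ]
    query = "la batería dura mucho"
    demos = select_demonstrations(pool, query, k=2)
    print(build_prompt(demos, query, {"bad": "malo", "good": "bueno"}, "Spanish"))
```

In this sketch the similarity function is the only component that needs cross-lingual knowledge, so swapping the toy lexicon for a proper encoder would leave the selection and prompt-assembly logic unchanged.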