The in-context learning (ICL) ability exhibited by Transformer architectures, especially in large language models (LLMs), has recently sparked significant research interest. However, the resilience of this ability to noisy samples, which are prevalent in both training corpora and prompt demonstrations, remains underexplored. In this paper, inspired by prior research that studies ICL using simple function classes, we take a closer look at this problem by investigating the robustness of Transformers against noisy labels. Specifically, we first conduct a thorough evaluation and analysis of the robustness of Transformers against noisy labels during in-context learning and show that they exhibit notable resilience against diverse types of noise in demonstration labels. We then explore whether introducing similar noise into the training set, akin to a form of data augmentation, enhances this robustness during inference, and find that such noise can indeed improve the robustness of ICL. Overall, our analysis and findings provide a comprehensive understanding of the resilience of Transformer models against label noise during ICL and offer valuable insights for research on Transformers in natural language processing. Our code is available at https://github.com/InezYu0928/in-context-learning.
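To make the setting concrete, the following is a minimal sketch of how label noise could be injected into the demonstrations of an in-context prompt for a simple function class. It is an illustration under stated assumptions, not the paper's implementation: the choice of linear functions, additive Gaussian noise, and the names `make_noisy_prompt` and `least_squares_predict` are all hypothetical, and the least-squares fit is included only as a reference predictor, not as the Transformer studied here. The authors' actual setup is in the repository linked above.

```python
import numpy as np


def make_noisy_prompt(n_demos=20, dim=5, noise_std=0.5, seed=0):
    """Build one in-context prompt for a random linear function.

    Assumption: demonstrations are (x_i, y_i) pairs with y_i = w . x_i,
    and label noise is additive Gaussian with standard deviation noise_std.
    """
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(dim)              # hidden task vector
    xs = rng.standard_normal((n_demos, dim))  # demonstration inputs
    clean_ys = xs @ w                         # noise-free labels
    noisy_ys = clean_ys + noise_std * rng.standard_normal(n_demos)
    x_query = rng.standard_normal(dim)        # input to predict for
    return xs, clean_ys, noisy_ys, x_query, w


def least_squares_predict(xs, ys, x_query):
    """Reference predictor: fit w by least squares on the demonstrations."""
    w_hat, *_ = np.linalg.lstsq(xs, ys, rcond=None)
    return float(x_query @ w_hat)


if __name__ == "__main__":
    xs, clean_ys, noisy_ys, x_query, w = make_noisy_prompt()
    target = float(x_query @ w)
    print("target:", target)
    print("prediction from clean labels:", least_squares_predict(xs, clean_ys, x_query))
    print("prediction from noisy labels:", least_squares_predict(xs, noisy_ys, x_query))
```

Comparing the two predictions against the target gives a rough sense of how much a given noise level degrades a non-learned predictor; a robustness evaluation of a trained model would measure the same gap with the model's own predictions in place of the least-squares fit.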