In-context learning (ICL) consistently outperforms zero-shot prompting in large language models (LLMs). However, the dynamics of ICL and the factors that influence downstream performance remain poorly understood, especially for natural language generation (NLG) tasks. This work addresses that gap by investigating the ICL capabilities of LLMs and studying how different aspects of the in-context demonstrations affect machine translation (MT). Our preliminary investigations examine whether ICL is driven predominantly by the demonstrations or by the instruction: we apply diverse perturbations to the in-context demonstrations while keeping the task instruction fixed. Responses to perturbed demonstrations vary across model families: BLOOM-7B derivatives are severely affected by noise, whereas Llama 2 derivatives are not only robust but also tend to improve over the clean baseline when given perturbed demonstrations. This suggests that the robustness of ICL may be governed by several factors, including the type of noise, the perturbation direction (source or target), the extent of pretraining of the specific model, and, where applicable, fine-tuning for downstream tasks. Further investigation is needed to understand these factors comprehensively.
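The perturbation setup described in the abstract (noisy demonstrations, unchanged instruction) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the instruction wording, the `Source:`/`Target:` prompt template, the two noise functions (`shuffle_words`, `duplicate_words`), and the toy English-French demonstrations are all hypothetical.

```python
import random

# Hypothetical task instruction; it is never perturbed.
INSTRUCTION = "Translate the following sentence from English to French."


def shuffle_words(text, rng):
    """Illustrative noise: randomly reorder the whitespace-separated words."""
    words = text.split()
    rng.shuffle(words)
    return " ".join(words)


def duplicate_words(text, rng, rate=0.3):
    """Illustrative noise: repeat each word with probability `rate`."""
    out = []
    for word in text.split():
        out.append(word)
        if rng.random() < rate:
            out.append(word)
    return " ".join(out)


def build_prompt(demos, query, perturb=None, side="source", seed=0):
    """Build a few-shot MT prompt, optionally perturbing one side of each demo.

    demos   -- list of (source, target) sentence pairs
    query   -- source sentence to translate (left clean)
    perturb -- function (text, rng) -> text, or None for the clean baseline
    side    -- "source" or "target": which side of each demo receives noise
    """
    if side not in ("source", "target"):
        raise ValueError("side must be 'source' or 'target'")
    rng = random.Random(seed)  # seeded so the same noise can be reproduced
    lines = [INSTRUCTION, ""]
    for src, tgt in demos:
        if perturb is not None:
            if side == "source":
                src = perturb(src, rng)
            else:
                tgt = perturb(tgt, rng)
        lines += [f"Source: {src}", f"Target: {tgt}", ""]
    lines += [f"Source: {query}", "Target:"]
    return "\n".join(lines)


# Toy demonstrations, invented for this sketch.
demos = [
    ("The house is big and the garden is small.",
     "La maison est grande et le jardin est petit."),
    ("I drink water.", "Je bois de l'eau."),
]

clean_prompt = build_prompt(demos, "The cat sleeps.")
noisy_prompt = build_prompt(demos, "The cat sleeps.",
                            perturb=shuffle_words, side="target")
print(noisy_prompt)
```

Comparing a model's translation quality on `clean_prompt` against `noisy_prompt`, for each noise type and each side, is one way to probe how much the demonstrations matter relative to the instruction under these assumptions.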