Large language models (LLMs) have shown an impressive ability to perform a wide range of tasks using in-context learning (ICL), where a few examples are used to describe a task to the model. However, the performance of ICL varies significantly with the choice of demonstrations, and it remains unclear why this happens or which factors influence that choice. In this work, we first revisit the factors contributing to this variance from both the data and the model perspective, and find that the choice of demonstrations is both data- and model-dependent. We further propose a data- and model-dependent demonstration selection method, \textbf{TopK + ConE}, based on the assumption that \textit{the performance of a demonstration positively correlates with its contribution to the model's understanding of the test samples}, resulting in a simple and effective recipe for ICL. Empirically, our method yields consistent improvements on both language understanding and generation tasks across different model scales. Further analyses confirm that, in addition to being general and stable across different settings, our method provides a unified explanation for the effectiveness of previous methods. Code will be released.
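The two-stage idea named in the abstract (a data-dependent TopK step followed by a model-dependent re-ranking step) can be illustrated with a minimal sketch. The abstract does not spell out how ConE is computed, so the model-based score here is a hypothetical callable, `test_loss`, standing in for "how poorly the model understands the test input given this demonstration". The function name, parameters, and both toy scorers are assumptions made for illustration, not the authors' implementation.

```python
from typing import Callable, List, Sequence


def select_demonstrations(
    test_input: str,
    candidates: Sequence[str],
    similarity: Callable[[str, str], float],
    test_loss: Callable[[str, str], float],
    top_k: int = 30,
    n_shots: int = 4,
) -> List[str]:
    """Two-stage selection: similarity-based TopK, then model-based re-ranking.

    similarity(test_input, demo): higher means more similar (data-dependent).
    test_loss(demo, test_input): hypothetical stand-in for how poorly the model
        understands the test input with the demo in context; lower is better
        (model-dependent).
    """
    # Stage 1: keep the top_k candidates most similar to the test input.
    shortlist = sorted(
        candidates, key=lambda d: similarity(test_input, d), reverse=True
    )[:top_k]
    # Stage 2: re-rank the shortlist by the model-based score, lowest first.
    reranked = sorted(shortlist, key=lambda d: test_loss(d, test_input))
    return reranked[:n_shots]


# Toy scorers (assumptions for the demo only; a real setup would use an
# embedding model for similarity and an LLM-derived score for test_loss).
def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity between two strings."""
    wa, wb = set(a.split()), set(b.split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0


def toy_loss(demo: str, test_input: str) -> float:
    """Counts demo words absent from the test input (placeholder score)."""
    return float(len(set(demo.split()) - set(test_input.split())))


pool = [
    "the movie was great",
    "the movie was bad and long",
    "stocks fell today",
    "a great movie",
]
chosen = select_demonstrations(
    "the movie was great fun", pool, jaccard, toy_loss, top_k=3, n_shots=2
)
print(chosen)
```

In this toy run the similarity stage drops the unrelated finance sentence, and the re-ranking stage then reorders the shortlist, so the final choice can differ from a pure similarity ranking. That is the sense in which the selection depends on both the data and the model.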