Large language models (LLMs) can perform in-context learning (ICL) by conditioning on a few demonstrations of a new downstream task. However, this learning paradigm is highly unstable: its performance varies substantially with factors such as the input distribution of the selected examples, their ordering, and the prompt format. In this work, we demonstrate that even when all of these factors are held constant, random selection of examples still results in high variance. Consequently, we explore how informative individual data examples are by quantifying the Information Gain (IG) obtained in prediction after observing a given example candidate, and we propose to sample the candidates with maximum IG. Additionally, we identify the presence of template bias, which can lead to unfair evaluations of IG during the sampling process. To mitigate this bias, we introduce a Calibration Before Sampling strategy. Experimental results show that our proposed method yields an average relative improvement of 14.3% across six classification tasks using three LLMs.
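The selection procedure described above can be illustrated with a minimal sketch. The abstract does not give formulas, so the following are assumptions rather than the authors' implementation: IG is taken as the entropy of a uniform label prior minus the entropy of the label distribution predicted after observing a candidate, and calibration is taken as dividing that distribution by the one predicted for a content-free input in the same template, then renormalizing. The function names and all probabilities are hypothetical; in practice the distributions would come from an LLM.

```python
import math
from typing import Dict, List, Optional, Sequence


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy in nats; zero-probability terms contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def calibrate(probs: Sequence[float], content_free: Sequence[float]) -> List[float]:
    """Assumed calibration rule: divide by the content-free prediction, renormalize."""
    scaled = [p / max(c, 1e-12) for p, c in zip(probs, content_free)]
    total = sum(scaled)
    return [s / total for s in scaled]


def information_gain(probs: Sequence[float],
                     prior: Optional[Sequence[float]] = None) -> float:
    """Assumed IG: H(prior) - H(prediction). The prior defaults to uniform."""
    if prior is None:
        prior = [1.0 / len(probs)] * len(probs)
    return entropy(prior) - entropy(probs)


def select_by_max_ig(candidate_probs: Dict[str, Sequence[float]],
                     content_free: Sequence[float],
                     k: int) -> List[str]:
    """Calibrate each candidate's prediction, score it by IG, keep the top k."""
    scores = {
        name: information_gain(calibrate(probs, content_free))
        for name, probs in candidate_probs.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical label distributions for three candidates on a binary task.
candidates = {"a": [0.9, 0.1], "b": [0.3, 0.7], "c": [0.7, 0.3]}
# Hypothetical template bias: a content-free input already favors label 0.
biased_template = [0.7, 0.3]
no_bias = [0.5, 0.5]

top_calibrated = select_by_max_ig(candidates, biased_template, k=1)
top_uncalibrated = select_by_max_ig(candidates, no_bias, k=1)
print(top_calibrated, top_uncalibrated)
```

With these made-up numbers, the uncalibrated ranking prefers candidate "a", whose confident prediction largely mirrors the template's own preference for label 0, whereas the calibrated ranking prefers candidate "b". This is the kind of unfair IG evaluation that calibrating before sampling is meant to avoid.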