In this paper, we treat in-context learning (ICL) as a meta-optimization process and use this view to explain why LLMs are sensitive to the order of ICL examples. This understanding leads us to develop Batch-ICL, an effective, efficient, and order-agnostic inference algorithm for ICL. Unlike the standard $N$-shot approach, Batch-ICL performs $N$ separate 1-shot forward computations and aggregates the resulting meta-gradients. The aggregated meta-gradients are then applied to a zero-shot forward computation to generate the final prediction. This batch processing makes the LLM agnostic to the order of ICL examples. Through extensive experiments and analysis, we demonstrate that Batch-ICL consistently outperforms most permutations of example sequences. In some cases, it even exceeds the performance of the best order for standard ICL, all while reducing the computational resources required. Furthermore, we develop a novel variant of Batch-ICL with multiple "epochs" of meta-optimization. This variant implicitly explores permutations of ICL examples, further improving ICL performance.
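The aggregation step can be illustrated with a toy sketch. It assumes a simplified linear-attention view in which each demonstration contributes a meta-gradient equal to the outer product of its value and key vectors. That simplification, the function names, and the dimensions are illustrative assumptions, not the authors' implementation, which operates on a real Transformer. The sketch only shows why averaging per-example updates removes the dependence on example order.

```python
import numpy as np

def one_shot_meta_gradient(key, value):
    """Toy meta-gradient of a single demonstration (assumed form: outer(value, key))."""
    return np.outer(value, key)

def batch_icl_update(keys, values):
    """Average the N one-shot meta-gradients; a mean is invariant to example order."""
    grads = [one_shot_meta_gradient(k, v) for k, v in zip(keys, values)]
    return np.mean(grads, axis=0)

def predict(w_zero_shot, delta_w, query):
    """Apply the aggregated update on top of the zero-shot weights (toy model)."""
    return (w_zero_shot + delta_w) @ query

# Hypothetical toy data: n demonstrations with d-dimensional keys and values.
rng = np.random.default_rng(0)
d, n = 4, 5
keys = rng.normal(size=(n, d))
values = rng.normal(size=(n, d))
w0 = rng.normal(size=(d, d))
query = rng.normal(size=d)

baseline = predict(w0, batch_icl_update(keys, values), query)

# Shuffle the demonstrations and recompute the prediction.
perm = rng.permutation(n)
shuffled = predict(w0, batch_icl_update(keys[perm], values[perm]), query)

# True when both orders give the same prediction (up to float tolerance).
order_agnostic = bool(np.allclose(baseline, shuffled))
```

Each one-shot update here is computed independently of the others, which mirrors the $N$ separate 1-shot forward computations described above. In standard $N$-shot ICL, by contrast, each example is processed in the context of the examples that precede it.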