Large language models (LLMs) can use in-context demonstrations to improve performance on zero-shot tasks. However, selecting the best in-context examples is challenging because model performance can vary widely depending on which examples are selected. We present a cross-entropy difference (CED) method for selecting in-context demonstrations. Our method is based on the observation that the effectiveness of an in-context demonstration correlates negatively with the perplexity that a language model finetuned on that demonstration assigns to the test example. We use parameter-efficient finetuning to train small models on the training data; these models are then used to compute the cross-entropy difference between a test example and every candidate in-context demonstration. This metric is used to rank and select in-context demonstrations independently for each test input. We evaluate our method on a mixed-domain dataset that combines 8 benchmarks representing 4 text generation tasks, and show that CED-based in-context demonstration selection can improve performance for a variety of LLMs.
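The ranking step described above can be illustrated with a minimal sketch. It assumes that per-token log-probabilities of the test example have already been obtained from each demonstration-specific finetuned model and from the un-finetuned base model, and it assumes the usual form of cross-entropy difference (finetuned cross-entropy minus base cross-entropy, lower is better); the abstract does not spell out the exact formula. The names `cross_entropy`, `rank_by_ced`, and the `demo_*` identifiers are hypothetical, and the numbers are toy values rather than results.

```python
from typing import Dict, List, Sequence


def cross_entropy(token_logprobs: Sequence[float]) -> float:
    """Mean negative log-likelihood per token (natural log)."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return -sum(token_logprobs) / len(token_logprobs)


def rank_by_ced(
    finetuned_logprobs: Dict[str, Sequence[float]],
    base_logprobs: Sequence[float],
    k: int = 1,
) -> List[str]:
    """Return the k candidate demonstrations with the lowest CED score.

    finetuned_logprobs maps a demonstration id to the log-probabilities that
    the model finetuned on that demonstration assigns to the test example's
    tokens. base_logprobs holds the same quantities from the base model.
    Assumption: CED = CE_finetuned - CE_base.
    """
    base_ce = cross_entropy(base_logprobs)
    scores = {
        demo_id: cross_entropy(logprobs) - base_ce
        for demo_id, logprobs in finetuned_logprobs.items()
    }
    # Lower CED means the finetuned model finds the test example less
    # surprising, which the abstract links to a more effective demonstration.
    return sorted(scores, key=lambda demo_id: scores[demo_id])[:k]


# Toy log-probabilities for one test example (illustrative only).
base = [-2.0, -1.5, -2.5]
candidates = {
    "demo_a": [-1.0, -1.0, -1.0],
    "demo_b": [-2.0, -2.0, -2.0],
    "demo_c": [-0.5, -1.0, -0.9],
}
selected = rank_by_ced(candidates, base, k=2)
print(selected)
```

For a single test input the base-model term is the same for every candidate, so in this sketch it does not change the ranking; it would matter only if scores were compared across different test inputs.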