In-context learning (ICL) ability has emerged with the increasing scale of large language models (LLMs), enabling them to learn input-label mappings from demonstrations and perform well on downstream tasks. However, under the standard ICL setting, LLMs may sometimes neglect query-related information in demonstrations, leading to incorrect predictions. To address this limitation, we propose a new paradigm called Hint-enhanced In-Context Learning (HICL) to explore the power of ICL in open-domain question answering, an important form of knowledge-intensive task. HICL leverages LLMs' reasoning ability to extract query-related knowledge from demonstrations, then concatenates this knowledge to the prompt to guide LLMs in a more explicit way. Furthermore, we track the source of this knowledge to identify the specific examples it comes from, and introduce a Hint-related Example Retriever (HER) to select informative examples for enhanced demonstrations. We evaluate HICL with HER on 3 open-domain QA benchmarks, and observe average gains of 2.89 EM and 2.52 F1 on gpt-3.5-turbo, and 7.62 EM and 7.27 F1 on LLaMA-2-Chat-7B, compared with the standard ICL setting.