In-context learning can improve performance on knowledge-rich tasks such as question answering. In such scenarios, in-context examples prompt a language model (LM) to surface information stored in its parametric knowledge. We study how to better construct in-context example sets, based on whether the model knows the information in each example. We distinguish 'known' examples, which the model can answer correctly from its parametric knowledge, from 'unknown' ones, which it cannot. Our experiments show that prompting with 'unknown' examples decreases performance, potentially because it encourages the model to hallucinate rather than search its parametric knowledge. Constructing an in-context example set that presents both known and unknown information performs best across diverse settings. We perform our analysis on three multi-answer question answering datasets, which allows us to further study answer set ordering strategies based on the LM's knowledge of each answer. Together, our findings shed light on how best to construct in-context example sets for knowledge-rich tasks.
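The known/unknown distinction and the mixed example set described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: `QAExample`, `split_by_knowledge`, `build_mixed_set`, `order_answers`, and the `answer_fn` callback (a stand-in for querying the LM without any in-context examples) are all hypothetical names, and treating an example as 'known' only when the model's prediction covers every gold answer is one possible criterion among several. Ordering known answers first is likewise just one candidate strategy.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass
class QAExample:
    """A multi-answer QA example (hypothetical container for this sketch)."""
    question: str
    answers: List[str]


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so answer matching is lenient."""
    return " ".join(text.lower().split())


def split_by_knowledge(
    examples: Iterable[QAExample],
    answer_fn: Callable[[str], List[str]],
) -> Tuple[List[QAExample], List[QAExample]]:
    """Partition examples into (known, unknown).

    Assumption: an example counts as 'known' when the model's own
    prediction (answer_fn) contains every gold answer.
    """
    known: List[QAExample] = []
    unknown: List[QAExample] = []
    for ex in examples:
        predicted = {normalize(a) for a in answer_fn(ex.question)}
        gold = {normalize(a) for a in ex.answers}
        if gold <= predicted:
            known.append(ex)
        else:
            unknown.append(ex)
    return known, unknown


def build_mixed_set(
    known: List[QAExample], unknown: List[QAExample], k: int
) -> List[QAExample]:
    """Pick k in-context examples, roughly half known and half unknown."""
    n_known = (k + 1) // 2       # known examples get the extra slot if k is odd
    n_unknown = k - n_known
    return known[:n_known] + unknown[:n_unknown]


def order_answers(answers: List[str], predicted: Iterable[str]) -> List[str]:
    """Place answers the model already produced before the remaining ones."""
    predicted_norm = {normalize(p) for p in predicted}
    # sorted() is stable, so the original order is kept within each group.
    return sorted(answers, key=lambda a: normalize(a) not in predicted_norm)
```

In practice `answer_fn` would wrap a call to the LM being prompted; it is kept as a plain callback here so the selection logic can be exercised without a model.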