Question answering over hybrid contexts is a complex task that requires combining information extracted from unstructured text and structured tables in various ways. In-Context Learning has recently demonstrated significant performance gains on reasoning tasks. In this paradigm, a large language model makes predictions based on a small set of supporting exemplars. The performance of In-Context Learning depends heavily on how the supporting exemplars are selected, particularly in HybridQA, where accounting for the diversity of reasoning chains and the large size of the hybrid contexts becomes crucial. In this work, we present Selection of ExEmplars for hybrid Reasoning (SEER), a novel method for selecting a set of exemplars that is both representative and diverse. The key novelty of SEER is that it formulates exemplar selection as a Knapsack Integer Linear Program. The Knapsack framework provides the flexibility to incorporate diversity constraints that prioritize exemplars with desirable attributes, and capacity constraints that ensure the prompt size respects the provided capacity budgets. We demonstrate the effectiveness of SEER on FinQA and TAT-QA, two real-world benchmarks for HybridQA, where it outperforms previous exemplar selection methods.
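To make the Knapsack formulation concrete, the following is a minimal sketch of exemplar selection as a 0/1 knapsack with one capacity constraint and a set of diversity constraints. It is not SEER's actual formulation: the `Exemplar` fields, the similarity score used as the objective, the `modality` attribute, and the function `select_exemplars` are all hypothetical, chosen only for illustration. For readability the sketch solves the program exactly by enumerating subsets, which is only feasible for a small candidate pool; a real system would presumably hand the same objective and constraints to an ILP solver.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Exemplar:
    name: str
    similarity: float  # hypothetical relevance score with respect to the test question
    tokens: int        # number of prompt tokens this exemplar would consume
    modality: str      # hypothetical attribute: "table", "text", or "hybrid"


def select_exemplars(pool, k, token_budget, min_per_modality):
    """Solve a small 0/1 knapsack exactly by enumeration.

    Maximize the summed similarity of the chosen exemplars subject to:
      - exactly k exemplars are chosen,
      - capacity: total tokens <= token_budget,
      - diversity: at least min_per_modality[m] exemplars of each modality m.
    Returns the names of the best subset, or None if no subset is feasible.
    """
    best_score, best_subset = float("-inf"), None
    for subset in combinations(pool, k):
        # Capacity constraint: the prompt must fit in the token budget.
        if sum(e.tokens for e in subset) > token_budget:
            continue
        # Diversity constraints: count exemplars per attribute value.
        counts = {}
        for e in subset:
            counts[e.modality] = counts.get(e.modality, 0) + 1
        if any(counts.get(m, 0) < n for m, n in min_per_modality.items()):
            continue
        score = sum(e.similarity for e in subset)
        if score > best_score:
            best_score, best_subset = score, subset
    return None if best_subset is None else [e.name for e in best_subset]


# Toy candidate pool; all values are invented for illustration.
pool = [
    Exemplar("a", 0.90, 500, "table"),
    Exemplar("b", 0.80, 400, "table"),
    Exemplar("c", 0.70, 300, "text"),
    Exemplar("d", 0.65, 200, "hybrid"),
    Exemplar("e", 0.50, 100, "text"),
]

chosen = select_exemplars(
    pool, k=3, token_budget=1000, min_per_modality={"table": 1, "text": 1}
)
print(chosen)
```

In this toy pool, the three most similar exemplars (`a`, `b`, `c`) total 1200 tokens and exceed the budget, so the capacity constraint forces a trade-off between relevance and prompt length, while the diversity constraints rule out any subset lacking a table or text exemplar.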