Passage reranking is a critical task in many applications, particularly those involving large volumes of documents. Existing neural architectures fall short when retrieving the most relevant passage for a given question: the semantics of segmented passages are often incomplete, and the question is typically matched against each passage in isolation, rarely exploiting contextual information from other passages that could supply comparative and reference signals. This paper presents a list-context attention mechanism that augments the passage representation with list-context information from the other candidates. The proposed coarse-to-fine (C2F) neural retriever addresses the out-of-memory limitation of the passage attention mechanism by splitting list-context modeling into two sub-processes coordinated by a cache policy learning algorithm, enabling efficient encoding of context information from a large number of candidate answers. The method generalizes to encoding context from any number of candidates in a single pass. Unlike most multi-stage information retrieval architectures, the model integrates the coarse and fine rankers into a joint optimization process, allowing feedback between the two stages so that both are updated simultaneously. Experiments demonstrate the effectiveness of the proposed approach.
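The two ideas in the abstract can be illustrated with a toy sketch: a coarse ranker scores all candidate passages cheaply and keeps only the top-k (standing in for the learned cache), and a fine stage then augments each cached passage vector by attending over the other cached candidates before rescoring. This is a minimal illustration under assumed conventions, not the paper's architecture; the function names, the dot-product coarse scorer, and the duplicated-query fine scorer are all illustrative choices.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def list_context_attention(P):
    """Augment each passage vector with attention over the other candidates.

    P: (n, d) matrix of candidate passage representations.
    Returns an (n, 2d) matrix: each row is the original vector
    concatenated with its list-context summary.
    """
    scores = P @ P.T                    # pairwise passage similarities
    np.fill_diagonal(scores, -np.inf)   # exclude each passage from its own context
    weights = softmax(scores, axis=-1)  # attention over the *other* candidates
    context = weights @ P               # list-context summary per passage
    return np.concatenate([P, context], axis=-1)

def coarse_to_fine_rank(q, P, k=4):
    """Coarse stage keeps top-k candidates (a stand-in for the learned cache);
    fine stage rescores them with list-context-augmented representations."""
    coarse = P @ q                      # cheap per-passage relevance scores
    cache = np.argsort(-coarse)[:k]     # retained candidate subset
    aug = list_context_attention(P[cache])
    # Toy fine scorer: match the augmented (2d) vectors against [q; q].
    fine = aug @ np.concatenate([q, q])
    return cache[np.argsort(-fine)]     # cached indices, reordered by fine score
```

Because attention runs only over the k cached candidates rather than all n passages, the quadratic attention cost is bounded regardless of candidate-list size, which is the memory issue the two-stage split is meant to address; in the actual model the cache policy is learned jointly with both rankers rather than fixed to a top-k cutoff.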