Conversational search facilitates complex information retrieval by enabling multi-turn interactions between users and the system. Supporting such interactions requires a comprehensive understanding of the conversational inputs in order to formulate an effective search query from the conversation history. In particular, the search query should include the relevant information from previous conversation turns. However, current approaches to conversational dense retrieval primarily rely on fine-tuning a pre-trained ad-hoc retriever on the whole conversational search session, which can be lengthy and noisy. Moreover, these approaches are limited by the amount of manual supervision signals available in existing datasets. To address these issues, we propose a History-Aware Conversational Dense Retrieval (HAConvDR) system, which incorporates two ideas: context-denoised query reformulation and automatic mining of supervision signals based on the actual impact of historical turns. Experiments on two public conversational search datasets demonstrate the improved history modeling capability of HAConvDR, in particular for long conversations with topic shifts.
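The abstract does not say how the "actual impact" of a historical turn is measured. As a minimal sketch of one plausible reading, the code below keeps a historical turn only if appending it to the current query raises the relevance score of a known relevant passage, then builds the reformulated query from the kept turns. Everything here is an assumption made for illustration: the function names `score`, `denoise_history`, and `reformulate` are hypothetical, the bag-of-words cosine is a toy stand-in for a dense retriever, and the availability of a gold passage presumes a training-time setting.

```python
import math
from collections import Counter


def score(query: str, passage: str) -> float:
    """Toy relevance score: cosine similarity over bag-of-words counts.

    Assumption: stands in for the similarity a dense retriever would compute.
    """
    q = Counter(query.lower().split())
    p = Counter(passage.lower().split())
    dot = sum(q[t] * p[t] for t in q)
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(
        sum(v * v for v in p.values())
    )
    return dot / norm if norm else 0.0


def denoise_history(history: list[str], current_query: str, gold_passage: str) -> list[str]:
    """Keep the historical turns whose addition improves the gold passage's score.

    Assumption: "impact" is read as the change in score relative to using
    the current query alone.
    """
    base = score(current_query, gold_passage)
    return [
        turn
        for turn in history
        if score(f"{turn} {current_query}", gold_passage) > base
    ]


def reformulate(history: list[str], current_query: str, gold_passage: str) -> str:
    """Concatenate the kept turns with the current query."""
    kept = denoise_history(history, current_query, gold_passage)
    return " ".join(kept + [current_query])
```

With the history `["tell me about the eiffel tower", "recommend pasta restaurants rome"]`, the current query `"how tall is it"`, and the gold passage `"the eiffel tower is 330 metres tall"`, this sketch keeps only the first turn: adding it raises the toy score, while the off-topic second turn lowers it.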