Conversational search facilitates complex information retrieval by enabling multi-turn interactions between users and the system. Supporting such interactions requires a comprehensive understanding of the conversational inputs in order to formulate a good search query from historical information; in particular, the query should include the relevant information from previous conversation turns. However, current approaches to conversational dense retrieval primarily rely on fine-tuning a pre-trained ad-hoc retriever on the whole conversational search session, which can be lengthy and noisy. Moreover, existing approaches are limited by the amount of manual supervision signals available in existing datasets. To address these issues, we propose a History-Aware Conversational Dense Retrieval (HAConvDR) system, which incorporates two ideas: context-denoised query reformulation and automatic mining of supervision signals based on the actual impact of historical turns. Experiments on two public conversational search datasets demonstrate the improved history modeling capability of HAConvDR, in particular for long conversations with topic shifts.
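The abstract does not specify how the "actual impact" of a historical turn is measured, so the following is only an illustrative sketch under stated assumptions, not the HAConvDR implementation. It assumes that a historical turn counts as useful when prepending it to the current query raises the retrieval score of the known relevant passage, and it replaces the dense retriever with a toy bag-of-words cosine similarity. The names `bow_cosine`, `judge_history`, and `reformulate` are hypothetical.

```python
import math
from collections import Counter


def bow_cosine(a: str, b: str) -> float:
    """Toy similarity: cosine over whitespace-token counts.

    Assumption: stands in for the score a dense retriever would assign.
    """
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[tok] * vb[tok] for tok in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def judge_history(query, history, gold_passage, score=bow_cosine):
    """Label each historical turn by its effect on retrieving the gold passage.

    A turn is labeled True when adding it to the current query increases
    the score of the gold passage (hypothetical impact criterion).
    """
    base = score(query, gold_passage)
    labels = []
    for turn in history:
        gain = score(f"{turn} {query}", gold_passage) - base
        labels.append((turn, gain > 0))
    return labels


def reformulate(query, history, gold_passage, score=bow_cosine):
    """Build a context-denoised query: helpful turns, then the current query."""
    kept = [t for t, useful in judge_history(query, history, gold_passage, score) if useful]
    return " ".join(kept + [query])
```

In this sketch, the per-turn labels play the role of automatically mined supervision signals, and dropping the turns labeled as unhelpful is one simple way to denoise a long session before encoding it. A real system would compute the scores with the retriever itself rather than with token overlap.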