This paper reports our methods and experiments for the TREC Conversational Assistance Track (CAsT) 2022. We aim to reproduce multi-stage retrieval pipelines and to explore one potential benefit of mixed-initiative interaction in conversational passage retrieval: reformulating raw queries. We propose a mixed-initiative query reformulation module that sits before the first ranking stage of a multi-stage retrieval pipeline and replaces the neural reformulation method; it reformulates queries based on the mixed-initiative interaction between the user and the system. Specifically, we design one algorithm that generates appropriate questions about the ambiguities in a raw query, and another that reformulates the raw query by parsing the user's feedback and incorporating it into the query. For the first ranking stage, we adopt a sparse ranking function, BM25, and a dense retrieval method, TCT-ColBERT. For the second ranking stage, we adopt a pointwise reranker, MonoT5, and a pairwise reranker, DuoT5. Experiments on both the TREC CAsT 2021 and TREC CAsT 2022 datasets show that our mixed-initiative query reformulation method improves retrieval performance compared with two popular reformulators: a neural reformulator, CANARD-T5, and a rule-based reformulator, Historical Query Expansion (HQE).
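The two-step reformulation idea described in the abstract (ask a question about an ambiguity, then fold the user's answer back into the raw query) can be illustrated with a minimal sketch. This is not the algorithm used in the paper: the pronoun list `AMBIGUOUS_TERMS`, the function names `find_ambiguity`, `generate_question`, and `reformulate`, and the substitution strategy are all hypothetical choices made only for illustration.

```python
import re

# Hypothetical set of referring expressions treated as ambiguous. This is an
# assumption for illustration, not the ambiguity detector used in the paper.
AMBIGUOUS_TERMS = {"it", "they", "them", "this", "that", "these", "those"}


def find_ambiguity(raw_query):
    """Return the first ambiguous token in the query, or None if there is none."""
    for token in re.findall(r"[A-Za-z']+", raw_query):
        if token.lower() in AMBIGUOUS_TERMS:
            return token
    return None


def generate_question(raw_query):
    """Build a question about the ambiguous token, or return None."""
    token = find_ambiguity(raw_query)
    if token is None:
        return None
    return f'What does "{token}" refer to in "{raw_query}"?'


def reformulate(raw_query, feedback):
    """Substitute the user's answer for the first ambiguous token."""
    token = find_ambiguity(raw_query)
    answer = feedback.strip().rstrip(".")
    if token is None or not answer:
        return raw_query
    # Replace only the first whole-word occurrence; the lambda keeps any
    # backslashes in the user's answer from being read as escape sequences.
    pattern = r"\b" + re.escape(token) + r"\b"
    return re.sub(pattern, lambda match: answer, raw_query, count=1)


query = "How is it treated?"
print(generate_question(query))
print(reformulate(query, "throat cancer"))
```

In a full pipeline, the reformulated query would then be passed to the first-stage retriever (BM25 or TCT-ColBERT) in place of the raw query.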