Open Retrieval Conversational Question Answering (OrConvQA) answers a question given a conversation as context and a document collection. A typical OrConvQA pipeline consists of three modules: a Retriever to retrieve relevant documents from the collection, a Reranker to rerank them given the question and the context, and a Reader to extract an answer span. The conversational turns can provide valuable context for answering the final query. State-of-the-art OrConvQA systems use the same history modeling for all three modules of the pipeline. We hypothesize that this is suboptimal. Specifically, we argue that a broader context is needed in the first modules of the pipeline so as not to miss relevant documents, while a narrower context is needed in the last modules to identify the exact answer span. We propose NORMY, the first unsupervised non-uniform history modeling pipeline, which generates the best conversational history for each module. We further propose a novel Retriever for NORMY, which employs keyphrase extraction on the conversation history and leverages passages retrieved in previous turns as additional context. We also created a new dataset for OrConvQA by expanding the doc2dial dataset. We implemented various state-of-the-art history modeling techniques and comprehensively evaluated them separately for each module of the pipeline on three datasets: OR-QUAC, our doc2dial extension, and ConvMix. Our extensive experiments show that NORMY outperforms the state-of-the-art both in the individual modules and in the end-to-end system.
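The idea of non-uniform history modeling can be illustrated with a minimal toy sketch in which each module sees a different slice of the conversation. Everything here is an assumption made for illustration: the names `Conversation`, `history_window`, `retrieve`, `rerank`, and `read`, the window sizes (all turns, one turn, no turns), and the word-overlap scoring are hypothetical and are not NORMY's actual components, which rely on keyphrase extraction and previously retrieved passages.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Conversation:
    history: List[str]  # earlier turns, oldest first
    question: str       # current question


def history_window(conv: Conversation, last_n: Optional[int]) -> str:
    """Join the question with the last `last_n` turns (None means all turns)."""
    if last_n is None:
        turns = conv.history
    else:
        start = max(0, len(conv.history) - last_n)
        turns = conv.history[start:]
    return " ".join(turns + [conv.question])


def overlap(query: str, text: str) -> int:
    """Toy relevance score: number of distinct shared lowercase words."""
    words = lambda s: set(re.findall(r"\w+", s.lower()))
    return len(words(query) & words(text))


def retrieve(conv: Conversation, collection: List[str], top_k: int = 2) -> List[str]:
    # Broad context (assumed: the full history) so relevant passages are not missed.
    query = history_window(conv, None)
    return sorted(collection, key=lambda p: overlap(query, p), reverse=True)[:top_k]


def rerank(conv: Conversation, passages: List[str]) -> List[str]:
    # Narrower context (assumed: only the most recent turn plus the question).
    query = history_window(conv, 1)
    return sorted(passages, key=lambda p: overlap(query, p), reverse=True)


def read(conv: Conversation, passages: List[str]) -> str:
    # Narrowest context (assumed: the question alone); pick the best sentence.
    query = history_window(conv, 0)
    sentences = [s.strip() for p in passages for s in p.split(".") if s.strip()]
    return max(sentences, key=lambda s: overlap(query, s))


conv = Conversation(
    history=["Tell me about the Eiffel Tower", "It is a tower in Paris"],
    question="When was it built",
)
collection = [
    "The Eiffel Tower is in Paris. It was built in 1889",
    "The Louvre is a museum in Paris. It was opened in 1793",
    "Bananas are yellow. They grow in bunches",
]
answer = read(conv, rerank(conv, retrieve(conv, collection)))
```

In this toy setup the question alone ("When was it built") does not name the entity, so the Retriever depends on the earlier turns to find the right passage, while the Reader works from the bare question to pick a single sentence.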