Ambiguous questions persist in open-domain question answering because formulating a precise question with a unique answer is often challenging. Min et al. (2020) tackled this issue by generating disambiguated questions for all possible interpretations of the ambiguous question. This can be effective, but it is not ideal for providing an answer to the user. Instead, we propose asking a clarification question, where the user's response helps identify the interpretation that best aligns with the user's intention. We first present CAMBIGNQ, a dataset of 5,654 ambiguous questions, each with relevant passages, possible answers, and a clarification question. The clarification questions were created efficiently by generating them with InstructGPT and manually revising them as necessary. We then define a pipeline of tasks and design appropriate evaluation metrics. Lastly, we achieve 61.3 F1 on ambiguity detection and 40.5 F1 on clarification-based QA, providing strong baselines for future work.
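To make the reported ambiguity-detection score concrete, the following is a minimal sketch of how an F1 score for a binary "ambiguous vs. unambiguous" decision could be computed. It assumes the task is scored as standard binary F1 with "ambiguous" as the positive class; the abstract does not spell out the exact metric definition, and the function name `binary_f1` and the toy labels are hypothetical, not taken from CAMBIGNQ.

```python
def binary_f1(gold, pred):
    """F1 for the positive class (True = 'ambiguous').

    Assumption: standard binary F1; the paper's exact metric may differ.
    """
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g and p)        # correctly flagged ambiguous
    fp = sum(1 for g, p in pairs if not g and p)    # flagged, but unambiguous
    fn = sum(1 for g, p in pairs if g and not p)    # ambiguous, but missed
    if tp == 0:
        return 0.0  # avoids division by zero when nothing is correctly flagged
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels invented for illustration only (not drawn from the dataset).
gold = [True, True, False, True, False]
pred = [True, False, False, True, True]
score = binary_f1(gold, pred)
print(f"F1 = {score:.3f}")
```

Under this reading, a score of 61.3 F1 would correspond to `binary_f1` returning roughly 0.613 over the evaluation set, expressed as a percentage.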