Clarification questions help conversational search systems resolve ambiguous or underspecified user queries. While prior work has focused on fluency and alignment with user intent, especially through facet extraction, much less attention has been paid to grounding clarifications in the underlying corpus. Without such grounding, systems risk asking questions that cannot be answered from the available documents. We introduce RAC (Retrieval-Augmented Clarification), a framework for generating corpus-faithful clarification questions. After comparing several indexing strategies for retrieval, we fine-tune a large language model to make optimal use of the retrieved context and to encourage the generation of evidence-based questions. We then apply contrastive preference optimization to favor questions supported by retrieved passages over ungrounded alternatives. Evaluated on four benchmarks, RAC demonstrates significant improvements over baselines. In addition to LLM-as-Judge assessments, we introduce novel metrics derived from NLI and data-to-text generation to assess how well questions are anchored in the retrieved context, and we show that our approach consistently enhances faithfulness.
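The NLI-derived grounding metric described above can be sketched as follows. This is a minimal illustration only: it assumes a max-over-passages aggregation, and it substitutes a simple token-overlap score for a trained NLI entailment model. The function names (`toy_entailment`, `grounding_score`) are hypothetical and not taken from the paper.

```python
import re


def toy_entailment(premise: str, hypothesis: str) -> float:
    """Stand-in for an NLI model's entailment probability.

    Here approximated (for illustration only) as the fraction of
    hypothesis tokens that also appear in the premise; a real system
    would use a trained NLI classifier instead.
    """
    premise_tokens = set(re.findall(r"\w+", premise.lower()))
    hyp_tokens = re.findall(r"\w+", hypothesis.lower())
    if not hyp_tokens:
        return 0.0
    return sum(t in premise_tokens for t in hyp_tokens) / len(hyp_tokens)


def grounding_score(question: str, passages: list[str]) -> float:
    """Score a clarification question by its best support from the corpus.

    A question counts as grounded if at least one retrieved passage
    supports it, so we take the maximum entailment score over passages.
    """
    if not passages:
        return 0.0
    return max(toy_entailment(p, question) for p in passages)


passages = [
    "The dataset covers laptops and desktop computers.",
    "Shipping options include express and standard delivery.",
]
question = "are you asking about laptops or desktop computers"
print(grounding_score(question, passages))  # → 0.375
print(grounding_score(question, []))        # → 0.0
```

Under this aggregation, an ungrounded question (one whose content appears in no retrieved passage) scores near zero, which is the behavior the contrastive preference step is meant to penalize.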