Modeling What-to-ask and How-to-ask for Answer-unaware Conversational Question Generation

Conversational Question Generation (CQG) is a critical task for machines to assist humans in fulfilling their information needs through conversations. The task is generally cast into two settings: answer-aware and answer-unaware. While the former eases the task by exposing the expected answer to the model, the latter is more realistic and has received growing attention recently. What-to-ask and how-to-ask are the two main challenges in the answer-unaware setting. To address the first challenge, existing methods mainly select sequential sentences in the context as the rationales. We argue that conversations generated using such naive heuristics may not be natural enough because, in reality, interlocutors often talk about relevant content that is not necessarily sequential in the context. Additionally, previous methods decide the type of question to be generated (boolean or span-based) implicitly. Modeling the question type explicitly is crucial because the answer, which hints to the model whether to generate a boolean or a span-based question, is unavailable. To this end, we present SG-CQG, a two-stage CQG framework. For the what-to-ask stage, we select a sentence as the rationale from a semantic graph that we construct and extract the answer span from it. For the how-to-ask stage, a classifier determines the target answer type of the question via two explicit control signals before generating and filtering. In addition, we propose Conv-Distinct, a novel evaluation metric for CQG, to evaluate the diversity of the conversation generated from a context. Compared with existing answer-unaware CQG models, the proposed SG-CQG achieves state-of-the-art performance.
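The abstract does not define Conv-Distinct, so the following is only a rough illustration of the general idea of scoring diversity over a generated conversation. It uses the standard Distinct-n ratio (unique n-grams divided by total n-grams) pooled over all questions in one conversation, which is an assumption standing in for the paper's metric and not a reimplementation of it. The function names `ngrams` and `distinct_n` and the whitespace tokenization are hypothetical choices made for this sketch.

```python
from typing import Iterable, List, Tuple


def ngrams(tokens: List[str], n: int) -> List[Tuple[str, ...]]:
    """Return all contiguous n-grams of a token list (empty if too short)."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def distinct_n(questions: Iterable[str], n: int = 2) -> float:
    """Standard Distinct-n pooled over the questions of one conversation.

    Assumption: lowercased whitespace tokenization. This is NOT the
    paper's Conv-Distinct, only a generic diversity ratio.
    """
    pooled: List[Tuple[str, ...]] = []
    for question in questions:
        tokens = question.lower().split()
        pooled.extend(ngrams(tokens, n))
    if not pooled:
        # No n-grams at all (empty input or questions shorter than n).
        return 0.0
    return len(set(pooled)) / len(pooled)


if __name__ == "__main__":
    conversation = [
        "Who wrote the letter ?",
        "Was it sent on time ?",
        "Who received the letter ?",
    ]
    print(f"Distinct-1: {distinct_n(conversation, 1):.3f}")
    print(f"Distinct-2: {distinct_n(conversation, 2):.3f}")
```

A higher ratio means the questions in the conversation repeat fewer n-grams; a conversation that asks the same question twice scores lower than one that covers different content.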

