We train a language model (LM) to robustly answer multistep questions by generating and answering sub-questions. We propose Chain-of-Questions, a framework that trains a model to generate sub-questions and sub-answers one at a time by leveraging human-annotated question decomposition meaning representation (QDMR). The key technical challenge is that QDMR contains only sub-questions, not the answers to those sub-questions, so we treat sub-answers as latent variables and optimize them using a novel dynamic mixture of Hard-EM and MAPO. Chain-of-Questions greatly outperforms strong neuro-symbolic methods by 9.0 F1 on the DROP contrast set and outperforms GPT-3.5 by 24.3 F1 on the HOTPOTQA adversarial set, demonstrating the effectiveness and robustness of our framework.
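To make the latent-variable idea concrete, the following is a minimal sketch of the Hard-EM half only, under stated assumptions: candidate sub-answers are given as a list, and a scoring function returns the log-likelihood of the correct final answer given each candidate. The function names (`hard_em_select`, `hard_em_loss`) and the toy scores are hypothetical and are not taken from the paper. The MAPO component and the dynamic mixing rule are not shown, since the abstract does not specify them.

```python
import math


def hard_em_select(candidates, log_likelihood):
    """Hard E-step: pick the single candidate sub-answer with the highest score.

    `log_likelihood` is an assumed callable mapping a candidate to the
    log-likelihood of the gold final answer given that candidate.
    """
    if not candidates:
        raise ValueError("need at least one candidate sub-answer")
    return max(candidates, key=log_likelihood)


def hard_em_loss(candidates, log_likelihood):
    """Return the selected candidate and its negative log-likelihood.

    In training, only this best candidate would contribute to the loss
    that is minimized in the M-step.
    """
    best = hard_em_select(candidates, log_likelihood)
    return best, -log_likelihood(best)


# Toy example: three hypothetical sub-answers with made-up probabilities.
toy_scores = {"2": math.log(0.1), "3": math.log(0.6), "5": math.log(0.3)}
best, loss = hard_em_loss(list(toy_scores), toy_scores.__getitem__)
```

Here the candidate `"3"` is selected because it has the highest assumed likelihood, and the loss is its negative log-likelihood. A soft alternative would marginalize over all candidates instead of committing to one.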