Modern systems for multi-hop question answering (QA) typically break questions into a sequence of reasoning steps, termed chain-of-thought (CoT), before arriving at a final answer. Often, multiple chains are sampled and aggregated through a voting mechanism over the final answers, but the intermediate steps themselves are discarded. While such approaches improve performance, they do not consider the relations between intermediate steps across chains and do not provide a unified explanation for the predicted answer. We introduce Multi-Chain Reasoning (MCR), an approach that prompts large language models to meta-reason over multiple chains of thought, rather than aggregating their answers. MCR examines different reasoning chains, mixes information between them, and selects the most relevant facts when generating an explanation and predicting the answer. MCR outperforms strong baselines on 7 multi-hop QA datasets. Moreover, our analysis reveals that MCR explanations are of high quality, enabling humans to verify its answers.
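To make the contrast concrete, the following is a minimal sketch, not the authors' implementation. It compares answer-only voting with assembling the intermediate steps of all chains into a single context for a meta-reasoning prompt. The `Chain` type, the function names, the prompt layout, and the toy chains are all assumptions made for illustration; no language model is called.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass
class Chain:
    """One sampled reasoning chain (hypothetical structure)."""
    steps: List[str]  # intermediate reasoning steps
    answer: str       # final answer reached by this chain


def majority_vote(chains: List[Chain]) -> str:
    """Baseline: vote over final answers only; intermediate steps are ignored."""
    counts = Counter(chain.answer for chain in chains)
    return counts.most_common(1)[0][0]


def build_meta_prompt(question: str, chains: List[Chain]) -> str:
    """Assumed prompt layout: expose the steps of every chain to a meta-reasoner."""
    lines = []
    for i, chain in enumerate(chains, start=1):
        lines.append(f"Chain {i}:")
        lines.extend(f"  - {step}" for step in chain.steps)
    lines.append(f"Question: {question}")
    lines.append("Explanation and answer:")
    return "\n".join(lines)


# Toy, invented chains used only to exercise the two functions.
example_chains = [
    Chain(["Alice was born in 1990.", "Bob was born in 1985."], "Bob"),
    Chain(["Alice was born in 1990."], "Alice"),
    Chain(["Bob was born in 1985.", "1985 is earlier than 1990."], "Bob"),
]

if __name__ == "__main__":
    print(majority_vote(example_chains))
    print(build_meta_prompt("Who is older, Alice or Bob?", example_chains))
```

In this sketch, `majority_vote` sees only the three final answers, whereas the string returned by `build_meta_prompt` would let a model combine facts that appear in different chains before producing an explanation and an answer.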