Large-scale pre-trained language models (PLMs) bring new opportunities to challenging problems, especially those that require high-level intelligence, such as math word problems (MWPs). However, directly applying existing PLMs to MWPs can fail because the generation process lacks sufficient supervision and therefore lacks the fast adaptivity that humans show. We observe that human reasoning has a dual reasoning framework consisting of an immediate reaction system (system 1) and a delicate reasoning system (system 2), where the overall reasoning is determined by their interaction. This inspires us to develop a cooperative reasoning-induced PLM for solving MWPs, called Cooperative Reasoning (CoRe), which yields a human-like reasoning architecture with system 1 as the generator and system 2 as the verifier. In our approach, the generator produces reasoning paths, and the verifiers supervise the evaluation of those paths to provide reliable feedback to the generator. We evaluate CoRe on several mathematical reasoning datasets and improve over state-of-the-art methods, with gains of up to 9.6% over the best baselines. Our code is available at https://github.com/TianHongZXY/CoRe
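As a rough illustration of the generator/verifier split described in the abstract, the following minimal Python sketch samples several candidate reasoning paths and keeps the one the verifier scores highest. It is a plain best-of-n reranking under stated assumptions, not the authors' actual procedure: the names `select_best_path`, `toy_generate`, and `toy_verify`, the scoring rule, and the example question are all hypothetical stand-ins for a trained PLM generator and learned verifiers.

```python
import operator
from typing import Callable, List, Tuple

# Hypothetical interfaces, introduced only for this illustration.
Generator = Callable[[str, int], List[str]]  # (question, n) -> candidate paths
Verifier = Callable[[str, str], float]       # (question, path) -> score

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def select_best_path(
    question: str,
    generate: Generator,
    verify: Verifier,
    num_candidates: int = 4,
) -> Tuple[str, float]:
    """Generate candidate paths (system 1) and keep the one the
    verifier (system 2) scores highest."""
    candidates = generate(question, num_candidates)
    if not candidates:
        raise ValueError("generator returned no candidates")
    scored = [(path, verify(question, path)) for path in candidates]
    # max() returns the first item among ties, so ordering is deterministic.
    return max(scored, key=lambda item: item[1])


def toy_generate(question: str, n: int) -> List[str]:
    """Stand-in for a PLM: returns fixed candidate paths (assumption)."""
    paths = ["3 + 4 = 7", "3 + 4 = 8", "3 * 4 = 12", "4 - 3 = 1"]
    return paths[:n]


def toy_verify(question: str, path: str) -> float:
    """Stand-in for a learned verifier: rewards arithmetic consistency
    and an operation that matches the question (assumption)."""
    lhs, rhs = path.split("=")
    a, op, b = lhs.split()
    score = 0.0
    if OPS[op](int(a), int(b)) == int(rhs):
        score += 0.5  # the stated result matches the computation
    if op == "+" and "total" in question:
        score += 0.5  # the operation fits a "total" question
    return score


if __name__ == "__main__":
    question = "Tom has 3 apples and buys 4 more. How many apples in total?"
    best_path, best_score = select_best_path(question, toy_generate, toy_verify)
    print(best_path, best_score)
```

In a real system the verifier scores would come from trained models rather than hand-written rules, and the feedback could also be used to further train the generator, as the abstract suggests.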