Language models have been shown to perform remarkably well on a wide range of natural language processing tasks. In this paper, we propose a novel system that uses language models to perform multi-step logical reasoning. Our system incorporates explicit planning into its inference procedure and is thus able to make more informed reasoning decisions at each step by looking ahead into their future effects. Moreover, we propose a training strategy that safeguards the planning process from being led astray by spurious features. Our full system significantly outperforms competing methods on multiple standard datasets. When using a T5 model as its core component, our system performs competitively with GPT-3 despite having only about 1B parameters (i.e., 175 times smaller than GPT-3). When using GPT-3.5, it significantly outperforms chain-of-thought prompting on the challenging PrOntoQA dataset. We have conducted extensive empirical studies to demonstrate that explicit planning plays a crucial role in the system's performance.