Pre-trained language models (LMs) are able to perform complex reasoning without explicit fine-tuning. To understand how pre-training with a next-token prediction objective contributes to the emergence of such reasoning capability, we propose viewing an LM as deriving new conclusions by aggregating indirect reasoning paths seen at pre-training time. We find this perspective effective in two important cases of reasoning: logical reasoning with knowledge graphs (KGs) and mathematical reasoning with math word problems (MWPs). More specifically, we formalize the reasoning paths as random walk paths on the knowledge/reasoning graphs. Analyses of learned LM distributions suggest that a weighted sum of relevant random walk path probabilities is a reasonable way to explain how LMs reason. Experiments and analysis on multiple KG and MWP datasets reveal the effect of training on random walk paths and suggest that augmenting the training data with unlabeled random walk reasoning paths can improve real-world multi-step reasoning performance.
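As a rough illustration of the weighted-sum view described in the abstract, the following minimal sketch enumerates uniform random walk paths on a toy knowledge graph and aggregates their probabilities per relation path. The triples, the function names, and the hand-picked path weights are all hypothetical and chosen only for illustration; in the paper's setting the weights would be reflected in the learned LM distribution, and the exact formulation may differ from this sketch.

```python
from collections import defaultdict

# Toy knowledge graph as (head, relation, tail) triples.
# All entities and relations here are hypothetical examples.
TRIPLES = [
    ("alice", "mother_of", "bob"),
    ("bob", "father_of", "carol"),
    ("alice", "lives_in", "paris"),
    ("bob", "lives_in", "paris"),
    ("carol", "lives_in", "lyon"),
]


def build_adjacency(triples):
    """Map each head entity to its outgoing (relation, tail) edges."""
    adj = defaultdict(list)
    for head, rel, tail in triples:
        adj[head].append((rel, tail))
    return adj


def path_probabilities(adj, start, max_len):
    """Return {(relation_path, end_entity): probability} for uniform
    random walks from `start`, covering every length from 1 to max_len."""
    results = defaultdict(float)
    frontier = [((), start, 1.0)]
    for _ in range(max_len):
        next_frontier = []
        for rels, node, prob in frontier:
            edges = adj.get(node, [])
            for rel, tail in edges:
                # Each outgoing edge is taken with equal probability.
                p = prob / len(edges)
                next_frontier.append((rels + (rel,), tail, p))
                results[(rels + (rel,), tail)] += p
        frontier = next_frontier
    return dict(results)


def aggregate(path_probs, weights):
    """Weighted sum of path probabilities per end entity, normalized
    to a distribution. `weights` maps relation paths to assumed weights."""
    scores = defaultdict(float)
    for (rels, end), p in path_probs.items():
        scores[end] += weights.get(rels, 0.0) * p
    total = sum(scores.values())
    if total == 0:
        return {}
    return {entity: s / total for entity, s in scores.items()}


if __name__ == "__main__":
    adj = build_adjacency(TRIPLES)
    probs = path_probabilities(adj, "alice", max_len=2)
    # Assumed weights: the path mother_of -> father_of is treated as
    # strong evidence for a hypothetical "grandmother_of" query.
    weights = {("mother_of", "father_of"): 0.8, ("mother_of", "lives_in"): 0.2}
    print(aggregate(probs, weights))
```

Here the relation path plays the role of an indirect reasoning path: no triple states the queried relation directly, yet the aggregated path probabilities still concentrate mass on a candidate answer.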