Recent efforts to improve the reasoning abilities of Large Language Models (LLMs) have focused on integrating formal logic solvers within neurosymbolic frameworks. A key challenge is that formal solvers lack commonsense world knowledge, preventing them from making reasoning steps that humans find obvious. Prior methods address this by using LLMs to supply missing commonsense assumptions, but these approaches implicitly assume universal agreement on such commonsense facts. In reality, commonsense beliefs vary across individuals. We propose a probabilistic framework for abductive commonsense reasoning that explicitly models this variation, aiming to determine whether most people would judge a statement as true or false. We introduce Probabilistic Abductive CommonSense (PACS), a novel algorithm that uses an LLM and a formal solver to sample proofs as observations of individuals' distinct commonsense beliefs, and aggregates conclusions across these samples. Empirically, PACS outperforms chain-of-thought reasoning, prior neurosymbolic methods, and search-based approaches across multiple benchmarks.
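The sample-and-aggregate idea described above can be illustrated with a minimal sketch. This is not the PACS algorithm itself: the abstract does not specify how proofs are sampled or how conclusions are combined, so the simple majority vote used here is an assumption. The names `aggregate_verdicts` and `toy_sampler` are hypothetical, and `toy_sampler` stands in for the LLM-plus-solver step that would produce one proof per simulated individual.

```python
import random
from collections import Counter
from typing import Callable, Optional

# Hypothetical type: one call simulates one individual's proof and returns
# its conclusion (True / False), or None if no proof was found.
Sampler = Callable[[random.Random], Optional[bool]]


def aggregate_verdicts(sample_verdict: Sampler,
                       n_samples: int = 100,
                       seed: int = 0) -> Optional[bool]:
    """Majority vote over sampled conclusions (an assumed aggregation rule).

    Returns True or False for the majority conclusion, or None on a tie
    (including the case where no sample produced a conclusion).
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_samples):
        verdict = sample_verdict(rng)
        if verdict is not None:  # skip samples where no proof was found
            counts[verdict] += 1
    if counts[True] == counts[False]:
        return None
    return counts[True] > counts[False]


def toy_sampler(rng: random.Random) -> bool:
    """Toy stand-in for the LLM + solver step.

    Assumes 80% of individuals hold the commonsense belief "birds can fly";
    with that belief the statement "Tweety can fly" is proved true,
    and without it the statement is judged false.
    """
    believes_birds_fly = rng.random() < 0.8
    return believes_birds_fly


if __name__ == "__main__":
    print(aggregate_verdicts(toy_sampler, n_samples=1000))
```

In this toy setting each sample plays the role of one person's commonsense beliefs, so the aggregate approximates whether most people would judge the statement true. A real system would replace `toy_sampler` with a call that asks an LLM for assumptions and checks the resulting proof with a formal solver.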