Large language models often fail at logical reasoning when semantic heuristics conflict with decisive evidence, a phenomenon we term cognitive traps. To address this fundamental limitation, we introduce the Deliberative Reasoning Network (DRN), a novel paradigm that reframes logical reasoning from probability maximization to uncertainty minimization. Instead of asking "Which answer is most likely?", DRN asks "Which hypothesis has the most internally consistent evidence?" DRN achieves intrinsic interpretability by explicitly tracking belief states and quantifying epistemic uncertainty for competing hypotheses through an iterative evidence-synthesis process. We validate our approach through two complementary architectures: a bespoke discriminative model that embodies the core uncertainty-minimization principle, and a lightweight verification module that augments existing generative LLMs. Evaluated on LCR-1000, our new adversarial reasoning benchmark designed to expose cognitive traps, the bespoke DRN improves over standard baselines by up to 15.2%. When integrated as a parameter-efficient verifier with Mistral-7B, our hybrid system raises accuracy from 20% to 80% on the most challenging problems. Critically, DRN demonstrates strong zero-shot generalization, improving TruthfulQA performance by 23.6% without additional training, which indicates that uncertainty-driven deliberation learns transferable reasoning principles. We position DRN as a foundational, verifiable System 2 reasoning component for building more trustworthy AI systems.
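To make the shift from probability maximization to uncertainty minimization concrete, the following is a minimal toy sketch, not the paper's actual mechanism: it assumes each competing hypothesis comes with a set of evidence-consistency scores (hypothetical inputs), uses Shannon entropy over the normalized scores as a stand-in for epistemic uncertainty, and selects the hypothesis whose evidence is most internally consistent (lowest entropy) rather than the one with the highest raw score.

```python
import math


def entropy(probs):
    """Shannon entropy (nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_hypothesis(evidence_scores):
    """Toy uncertainty-minimizing selection (illustrative, not DRN itself).

    evidence_scores: dict mapping hypothesis -> list of non-negative
    evidence-consistency scores. Each list is normalized into a
    distribution; the hypothesis with the lowest-entropy (most
    internally consistent) evidence distribution wins.
    """
    best_hyp, best_uncertainty = None, float("inf")
    for hyp, scores in evidence_scores.items():
        total = sum(scores)
        probs = [s / total for s in scores]
        uncertainty = entropy(probs)
        if uncertainty < best_uncertainty:
            best_hyp, best_uncertainty = hyp, uncertainty
    return best_hyp, best_uncertainty
```

A hypothesis backed by one strongly dominant piece of evidence (a peaked, low-entropy distribution) is preferred over one whose support is spread thinly across conflicting signals, even if the latter's raw total score is higher.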