Free Energy Heuristics: Fast-And-Frugal Cognition as Active Inference Under Uncertain Precision

Chain-of-thought (CoT) improves large language models' performance on math and symbolic reasoning. But on planning, contested ethics, and tasks where the model cannot check itself, more reasoning makes things worse. Both effects are documented; what has been missing is a principled account of which property decides the outcome. We argue it is meta-uncertainty: how unsure the model is about the reliability of its own evidence. When that uncertainty is high, extra reasoning stops adding signal and starts manufacturing false confidence. We prove that the policy minimizing expected free energy under uncertain precision stops integrating cues after a finite number of high-validity ones when the precision prior is heavy-tailed (Theorem 2.6.1) and, under a Descending Dominance condition, is sample-wise identical to take-the-best (Theorem 2.7.4). Fast-and-frugal heuristics and active inference are, then, two descriptions of the same computation. The prediction is that on high-meta-uncertainty items, longer CoT should degrade accuracy. We score the regime per item (simulate-and-recover rho > 0.96), build FEH-79, a benchmark of Knightian frames with matched controls, and run a pre-registered study across seven models (five open-weight 3B-32B, two frontier), five CoT lengths, and 7,875 responses. The gate, fixed before any data were collected, required a negative interaction with posterior probability above 0.95 and an accuracy drop of more than 6 points. It held. The high-regime drop is 17.3 points (95% CI [7.7, 25.5]); matched items with definite answers show no cost. The effect is regime-dependent: decisive in capable mid-to-large models, directional in the two frontier systems, absent-to-reversed in the weakest. The framework explains when CoT helps and unifies the Bayesian and fast-and-frugal traditions: less-is-more effects are evidence about the meta-uncertainty regime, not against Bayesian cognition.
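For readers unfamiliar with take-the-best, the heuristic named above can be sketched in a few lines of Python. This is a minimal illustration of the classic decision rule for binary cues, not the expected-free-energy policy analyzed in the theorems. The function name, argument names, and the convention that a larger cue value favors an option are assumptions made for this example.

```python
from typing import Optional, Sequence, Tuple


def take_the_best(
    cues_a: Sequence[int],
    cues_b: Sequence[int],
    validities: Sequence[float],
) -> Tuple[Optional[str], int]:
    """Classic take-the-best for two options (illustrative sketch).

    cues_a, cues_b: cue values for options "a" and "b" (assumed binary,
        with the larger value favoring that option).
    validities: one validity per cue; higher means more trustworthy.

    Returns (choice, cues_inspected). choice is None when no cue
    discriminates between the options.
    """
    # Search cues in descending order of validity.
    order = sorted(range(len(validities)), key=lambda i: validities[i], reverse=True)
    for inspected, i in enumerate(order, start=1):
        # Stop at the first cue that discriminates and decide on it alone;
        # all lower-validity cues are ignored.
        if cues_a[i] != cues_b[i]:
            return ("a" if cues_a[i] > cues_b[i] else "b"), inspected
    return None, len(order)


# The most valid cue (index 0) ties, so the second most valid cue (index 2) decides.
choice, n_inspected = take_the_best([1, 0, 1], [1, 1, 0], [0.9, 0.6, 0.8])
```

The second return value records how many cues were inspected before stopping, which is the quantity the finite-stopping claim concerns: the rule never integrates evidence beyond the first discriminating cue.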
