We reinterpret Kant's Critique of Pure Reason as a theory of feedback stability, viewing reason as a regulator that keeps inference within the bounds of possible experience. We formalize this intuition via a composite instability index (H-Risk) that combines spectral margin, conditioning, temporal sensitivity, and innovation amplification. In linear-Gaussian simulations, higher H-Risk predicts overconfident errors even when the system is formally stable, revealing a gap between nominal and epistemic stability. Extending the analysis to large language models (LLMs), we observe preliminary correlations between internal fragility and miscalibration or hallucination (confabulation), and find that, in small-scale tests, lightweight critique prompts can either modestly improve or modestly worsen calibration. These results suggest a structural bridge between Kantian self-limitation and feedback control, offering a principled lens for diagnosing and potentially mitigating overconfidence in reasoning systems.