Large language models are increasingly integrated into decision-making in healthcare, law, finance, engineering, and government. Yet they share a critical limitation: they produce fluent outputs even when their internal reasoning has drifted. A confident answer can conceal uncertainty, speculation, or inconsistency, and small changes in phrasing can lead to different conclusions. This makes LLMs useful assistants but unreliable partners in high-stakes contexts. Humans exhibit a related weakness: they often mistake fluency for reliability. When a model responds smoothly, users tend to trust it, even when model and user are drifting together. This paper is the first in a five-paper research series on stabilising human-AI reasoning. The series proposes a two-layer approach: Parts II-IV introduce human-side mechanisms such as uncertainty cues, conflict surfacing, and auditable reasoning traces, while Part V develops a model-side Epistemic Control Loop (ECL) that detects instability and modulates generation accordingly. Together, these layers supply an operational substrate that governance currently lacks, increasing the signal-to-noise ratio at the point of use. Stabilising interaction makes uncertainty and drift visible before enforcement is applied, enabling more precise capability governance. The approach aligns with emerging compliance expectations, including the EU AI Act and ISO/IEC 42001, by making reasoning processes traceable under real conditions of use. The central claim is that fluency is not reliability. Without structures that stabilise both human and model reasoning, AI cannot be trusted or governed where it matters most.