Latent (or continuous) chain-of-thought methods replace explicit textual rationales with a number of internal latent steps, but these intermediate computations are difficult to evaluate beyond correlation-based probes. In this paper, we treat latent chain-of-thought as a manipulable causal process in representation space: we model latent steps as variables in a structural causal model (SCM) and analyze their effects through step-wise $\mathrm{do}$-interventions. We study two representative paradigms, Coconut and CODI, on both mathematical and general reasoning tasks to investigate three key questions: (1) which steps are causally necessary for correctness, and when do answers become decidable early; (2) how does influence propagate across steps, and how does this structure compare to explicit CoT; and (3) do intermediate trajectories retain competing answer modes, and how does output-level commitment differ from representational commitment across steps? We find that latent-step budgets behave less like homogeneous extra depth and more like staged functionality with non-local routing, and we identify a persistent gap between early output bias and late representational commitment. These results motivate mode-conditional and stability-aware analyses, along with corresponding training and decoding objectives, as more reliable tools for interpreting and improving latent reasoning systems. Code is available at https://github.com/J1mL1/causal-latent-cot.
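To make the notion of a step-wise $\mathrm{do}$-intervention concrete, the following is a minimal sketch on a toy recurrent latent chain. It is not the Coconut or CODI architecture, and it is not the code in the linked repository. The update rule `h_t = tanh(W h_{t-1} + U x)`, the zero-vector intervention value, and the names `run_chain` and `stepwise_effects` are all assumptions chosen for illustration. The sketch overwrites one latent step at a time, recomputes the downstream steps, and records how far the answer logits move.

```python
import numpy as np

def run_chain(x, W, U, readout, n_steps, do=None):
    """Roll out a toy latent chain h_t = tanh(W h_{t-1} + U x).

    `do` maps a step index to a vector. At that step the computed value is
    overwritten (a do-intervention) and later steps are recomputed from it.
    This toy update rule is an illustrative assumption, not a real model.
    """
    do = do or {}
    h = np.zeros(W.shape[0])
    for t in range(n_steps):
        h = np.tanh(W @ h + U @ x)
        if t in do:
            # Sever the step from its parents by setting its value directly.
            h = np.asarray(do[t], dtype=float)
    return readout @ h  # answer logits

def stepwise_effects(x, W, U, readout, n_steps):
    """L1 change in answer logits when each step is zeroed in turn."""
    base = run_chain(x, W, U, readout, n_steps)
    zero = np.zeros(W.shape[0])
    effects = []
    for t in range(n_steps):
        logits = run_chain(x, W, U, readout, n_steps, do={t: zero})
        effects.append(float(np.abs(logits - base).sum()))
    return effects

# Hypothetical toy dimensions and random weights, for illustration only.
rng = np.random.default_rng(0)
d, k, n_steps = 8, 3, 4
W = rng.normal(scale=0.5, size=(d, d))
U = rng.normal(scale=0.5, size=(d, d))
readout = rng.normal(size=(k, d))
x = rng.normal(size=d)

effects = stepwise_effects(x, W, U, readout, n_steps)
for t, e in enumerate(effects):
    print(f"do(h_{t} := 0): logit shift = {e:.4f}")
```

In a real latent-reasoning model the intervention would be applied to the model's continuous thought vectors, and the intervention value (zero, mean, or a resampled state) is itself a design choice that affects what the measured effect means.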