Latent or continuous chain-of-thought methods replace explicit textual rationales with a sequence of internal latent steps, but these intermediate computations are difficult to evaluate beyond correlation-based probes. In this paper, we view latent chain-of-thought as a manipulable causal process in representation space: we model latent steps as variables in a structural causal model (SCM) and analyze their effects through step-wise $\mathrm{do}$-interventions. We study two representative paradigms (Coconut and CODI) on both mathematical and general reasoning tasks to investigate three key questions: (1) which steps are causally necessary for correctness, and when do answers become decidable early; (2) how does influence propagate across steps, and how does this structure compare to explicit CoT; and (3) do intermediate trajectories retain competing answer modes, and how does output-level commitment differ from representational commitment across steps? We find that latent-step budgets behave less like homogeneous extra depth and more like staged functionality with non-local routing, and we identify a persistent gap between early output bias and late representational commitment. These results motivate mode-conditional and stability-aware analyses -- and corresponding training/decoding objectives -- as more reliable tools for interpreting and improving latent reasoning systems.
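The step-wise $\mathrm{do}$-intervention can be sketched on a toy latent reasoner. This is a minimal illustrative stand-in, not the Coconut or CODI architectures: the model, dimensions, and the zero-vector "neutral" intervention value are all assumptions for exposition. A forward pass rolls out $T$ latent steps; an intervention severs one step's latent from its parents by overwriting it, and the causal effect of that step is read off as the shift in the answer logits.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a latent chain-of-thought model: T latent steps,
# each a fixed nonlinear update z_{t+1} = tanh(W z_t), followed by a
# linear readout producing answer logits. (Hypothetical; for illustration.)
D, T, A = 8, 4, 3
W = rng.normal(scale=0.5, size=(D, D))
R = rng.normal(scale=0.5, size=(A, D))

def run(z0, do=None):
    """Roll out T latent steps; `do` = {step_index: forced latent value}
    implements a hard do-intervention on that step's latent variable."""
    z = z0
    trajectory = []
    for t in range(T):
        z = np.tanh(W @ z)
        if do is not None and t in do:
            z = do[t]  # sever z_t from its parents: do(z_t = v)
        trajectory.append(z)
    return R @ trajectory[-1], trajectory

z0 = rng.normal(size=D)
logits, _ = run(z0)

# Causal necessity of step t: intervene with a neutral latent
# (zeros here, an arbitrary choice) and measure the logit shift.
for t in range(T):
    logits_do, _ = run(z0, do={t: np.zeros(D)})
    print(f"step {t}: total |delta logits| = {np.abs(logits - logits_do).sum():.3f}")
```

In an actual experiment the latent update would be a transformer forward pass and the intervention value would come from a counterfactual trajectory or a learned baseline, but the bookkeeping (overwrite one step, re-run the suffix, compare outputs) is the same.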