The verbosity of Chain-of-Thought (CoT) reasoning hinders its mass deployment in efficiency-critical applications. Recently, implicit CoT approaches have emerged, which encode reasoning steps within an LLM's hidden embeddings (termed ``implicit reasoning'') rather than as explicit tokens. This approach accelerates CoT by shortening the reasoning sequence and bypassing some LLM components. However, existing implicit CoT methods face two significant challenges: (1) they fail to preserve the semantic alignment between the implicit reasoning (when transformed to natural language) and the ground-truth reasoning, resulting in significant CoT performance degradation; and (2) they focus on reducing the length of the implicit reasoning while neglecting the considerable time cost of generating each individual implicit reasoning token. To tackle these challenges, we propose a novel semantically-aligned implicit CoT framework termed SemCoT. For the first challenge, we design a contrastively trained sentence transformer that evaluates semantic alignment between implicit and explicit reasoning, and use it to enforce semantic preservation during implicit reasoning optimization. For the second challenge, we introduce an efficient implicit reasoning generator by fine-tuning a lightweight language model with knowledge distillation. Guided by our sentence transformer, this generator distills ground-truth reasoning into semantically aligned implicit reasoning while also optimizing for accuracy. SemCoT is the first approach that enhances CoT efficiency by jointly optimizing token-level generation speed and semantic alignment with ground-truth reasoning. Extensive experiments demonstrate the superior performance of SemCoT compared to state-of-the-art methods in both efficiency and effectiveness. Our code can be found at https://github.com/YinhanHe123/SemCoT/.
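To make the alignment objective concrete, the following is a minimal sketch of a contrastive (InfoNCE-style) loss over paired implicit/explicit reasoning embeddings, in the spirit of the contrastively trained sentence transformer described above. It is an illustrative assumption, not the paper's actual implementation: embeddings are stand-in NumPy vectors, and the function names (`cosine_sim_matrix`, `info_nce_loss`) and the temperature value are hypothetical.

```python
import numpy as np

def cosine_sim_matrix(A, B):
    # Pairwise cosine similarity between two (n, d) embedding matrices.
    A = A / np.linalg.norm(A, axis=1, keepdims=True)
    B = B / np.linalg.norm(B, axis=1, keepdims=True)
    return A @ B.T

def info_nce_loss(implicit_emb, explicit_emb, temperature=0.07):
    """InfoNCE-style contrastive loss: the i-th implicit-reasoning embedding
    should be close to the i-th explicit (ground-truth) reasoning embedding
    and far from all other explicit embeddings in the batch."""
    logits = cosine_sim_matrix(implicit_emb, explicit_emb) / temperature
    # Numerically stable log-softmax over each row; positives sit on the diagonal.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

# Toy batch: 4 pairs of 8-dimensional embeddings.
rng = np.random.default_rng(0)
explicit = rng.normal(size=(4, 8))
aligned = explicit + 0.05 * rng.normal(size=(4, 8))   # semantically aligned
misaligned = rng.normal(size=(4, 8))                  # unrelated embeddings
print(info_nce_loss(aligned, explicit) < info_nce_loss(misaligned, explicit))
```

Embeddings that stay close to their ground-truth counterparts incur a lower loss, so minimizing this objective during implicit reasoning optimization pushes the generator toward semantically faithful implicit tokens.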