While Chain-of-Thought (CoT) significantly enhances the performance of Large Language Models (LLMs), explicit reasoning chains introduce substantial computational redundancy. Recent latent reasoning methods attempt to mitigate this by compressing the reasoning process into a latent space, but they often suffer from severe performance degradation due to the lack of appropriate compression guidance. In this study, we propose Rendered CoT-Guided variational Latent Reasoning (ReGuLaR), a simple yet novel latent learning paradigm that resolves this issue. Fundamentally, we formulate latent reasoning within the Variational Auto-Encoding (VAE) framework, sampling the current latent reasoning state from a posterior distribution conditioned on the previous ones. Specifically, when learning this variational latent reasoning model, we render explicit reasoning chains as images, from which we extract dense visual-semantic representations to regularize the posterior distribution, thereby achieving efficient compression with minimal information loss. Extensive experiments demonstrate that ReGuLaR significantly outperforms existing latent reasoning methods in both computational efficiency and reasoning effectiveness, and even surpasses CoT through multi-modal reasoning, providing a new and insightful solution to latent reasoning. Code: https://github.com/FanmengWang/ReGuLaR.
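The core VAE-style mechanism described above, sampling each latent reasoning state from a learned posterior while a KL term pulls that posterior toward a visual-semantic embedding of the rendered CoT, can be illustrated with a minimal conceptual sketch. All names here (`reparameterize`, `kl_to_target`, the stand-in linear posterior and the random `visual_embed` vector) are hypothetical illustrations, not the paper's actual architecture or loss:

```python
import numpy as np

rng = np.random.default_rng(0)

def reparameterize(mu, logvar):
    # Sample z ~ N(mu, sigma^2) via the reparameterization trick,
    # so gradients can flow through the sampling step.
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

def kl_to_target(mu, logvar, target_mu):
    # KL( N(mu, diag(sigma^2)) || N(target_mu, I) ): regularizes the
    # posterior toward the visual-semantic embedding of the rendered CoT.
    return 0.5 * np.sum(np.exp(logvar) + (mu - target_mu) ** 2 - 1.0 - logvar)

d = 8
prev_state = rng.standard_normal(d)            # previous latent reasoning state
W_mu = rng.standard_normal((d, d)) * 0.1       # stand-in posterior mean map
W_lv = rng.standard_normal((d, d)) * 0.01      # stand-in posterior log-variance map
mu, logvar = prev_state @ W_mu, prev_state @ W_lv

visual_embed = rng.standard_normal(d)          # stand-in rendered-CoT embedding
z = reparameterize(mu, logvar)                 # next latent reasoning state
loss = kl_to_target(mu, logvar, visual_embed)  # compression-guidance term
```

In training, this KL term would be added to the task loss so that the compressed latent trajectory stays close to the dense representation extracted from the rendered reasoning chain.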