Entropy-based inference methods have gained traction for improving the reliability of Large Language Models (LLMs). However, many existing approaches, such as entropy-minimization techniques, incur high computational overhead and fail to exploit historical token context effectively. To address these limitations, we propose Spectral Logit Sculpting (SLS), a lightweight inference-time optimization method that dynamically modulates token distributions using the spectral and entropic properties of recent logits. SLS maintains a sliding buffer of top-K logits, performs on-the-fly Singular Value Decomposition (SVD) to identify dominant spectral directions, and adaptively rescales logits based on both entropy and logit-gap statistics, activating only when uncertainty is high. Without updating any model parameters, SLS effectively sharpens the output distribution while preserving contextual consistency. Experimental results on multiple public benchmarks show that SLS consistently outperforms existing baselines, achieving superior accuracy on mathematical, coding, and scientific reasoning tasks.
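The abstract names the ingredients (a sliding buffer of top-K logits, on-the-fly SVD over that buffer, and an entropy/logit-gap gate) but not the exact update rule. The following is a minimal NumPy sketch of one plausible realization; the function name `sls_step`, the thresholds, the window length, and the rescaling rule `topk_vals + alpha * proj` are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def sls_step(logits, buffer, k=50, window=16,
             entropy_thresh=2.0, gap_thresh=1.0, alpha=0.5):
    """One hypothetical SLS-style step: gate on uncertainty, then
    sharpen the top-k logits along the dominant spectral direction
    of recently buffered top-k logits. Returns (new_logits, buffer)."""
    # 1. Maintain a sliding buffer of top-k logit values (descending order).
    topk_idx = np.argsort(logits)[::-1][:k]
    topk_vals = logits[topk_idx]
    buffer.append(topk_vals)
    if len(buffer) > window:
        buffer.pop(0)

    # 2. Uncertainty statistics: Shannon entropy and the top-1/top-2 logit gap.
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    entropy = -np.sum(probs * np.log(probs + 1e-12))
    gap = topk_vals[0] - topk_vals[1]

    # 3. Gate: leave confident steps untouched (low entropy or a large gap),
    #    and wait until the buffer holds at least two rows for the SVD.
    if entropy < entropy_thresh or gap > gap_thresh or len(buffer) < 2:
        return logits, buffer

    # 4. SVD of the buffered top-k logits; the leading right-singular
    #    vector is the dominant spectral direction of recent activity.
    H = np.stack(buffer)                      # shape (T, k)
    _, _, vt = np.linalg.svd(H, full_matrices=False)
    direction = vt[0]                         # shape (k,)

    # 5. Sharpen: push the current top-k logits along that direction.
    #    (Illustrative rescaling; the paper's exact rule may differ.)
    proj = np.dot(topk_vals, direction) * direction
    sharpened = logits.copy()
    sharpened[topk_idx] = topk_vals + alpha * proj
    return sharpened, buffer
```

Note that the gate makes the method cheap in the common case: on confident steps the only extra work is the buffer update and two scalar statistics, and the SVD runs only when the entropy and gap thresholds indicate high uncertainty.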