Best-of-N sampling is a powerful method for improving Large Language Model (LLM) performance, but it is often limited by its dependence on massive, text-based reward models. These models are not only computationally expensive but also data-hungry, requiring extensive labeled datasets for training. Crucially, they overlook a rich, readily available signal: the LLM's own internal hidden states. To address this data and efficiency gap, we introduce SWIFT (Simple Weighted Intrinsic Feedback Technique), a novel and lightweight method that learns a reward function directly from the information embedded in LLM hidden states. Operating at the token embedding level, SWIFT employs simple linear layers to distinguish between preferred and dispreferred generations, eliminating the need for computationally intensive text-based modeling. Extensive experiments on standard benchmarks show that SWIFT outperforms existing baselines (12.7% higher accuracy than EurusRM-7B on the MATH dataset) while using less than 0.005% of their parameters. Its robust scalability, its compatibility with certain closed-source models via logit access, and its ability to combine with traditional reward models for additional gains highlight SWIFT's practical value and its contribution to more efficient, data-driven LLM post-training. Our code is available at https://github.com/aster2024/SWIFT .
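The core mechanism described above, a lightweight linear head that scores candidate generations from their hidden states and picks the best of N, can be sketched as follows. This is a minimal illustration, not the paper's exact method: the pooling strategy, weighting scheme, and how the head is trained are assumptions here, and the hidden states are random toy data.

```python
import numpy as np

rng = np.random.default_rng(0)
HIDDEN_DIM = 16  # toy size; real LLM hidden states are far larger

# Hypothetical learned parameters of the linear reward head.
w = rng.normal(size=HIDDEN_DIM)
b = 0.0

def reward(hidden_states: np.ndarray) -> float:
    """Score one generation from its (num_tokens, HIDDEN_DIM) hidden states.

    Mean-pools over tokens, then applies a single linear layer. SWIFT's
    actual token weighting may differ; mean pooling is an assumption.
    """
    pooled = hidden_states.mean(axis=0)
    return float(pooled @ w + b)

def best_of_n(candidates: list) -> int:
    """Return the index of the highest-reward candidate generation."""
    return int(np.argmax([reward(h) for h in candidates]))

# Usage: score 4 candidate generations of varying token length.
cands = [rng.normal(size=(int(n), HIDDEN_DIM))
         for n in rng.integers(5, 20, size=4)]
idx = best_of_n(cands)
```

Because the head is a single linear layer over existing hidden states, scoring N candidates costs only N matrix-vector products on top of generation itself, which is the source of the parameter and compute savings the abstract claims.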