Large language models (LLMs) have recently shown strong performance as zero-shot rankers, yet their effectiveness is highly sensitive to prompt formulation, particularly role-play instructions. Prior analyses suggest that role-related signals are encoded along activation channels that are largely separate from query-document representations, raising the possibility of steering ranking behavior directly at the activation level rather than through brittle prompt engineering. In this work, we propose RankSteer, a post-hoc activation steering framework for zero-shot pointwise LLM ranking. We characterize ranking behavior through three disentangled and steerable directions in representation space: a \textbf{decision direction} that maps hidden states to relevance scores, an \textbf{evidence direction} that captures relevance signals not directly exploited by the decision head, and a \textbf{role direction} that modulates model behavior without injecting relevance information. Using projection-based interventions at inference time, RankSteer jointly controls these directions to calibrate ranking behavior without modifying model weights or introducing explicit cross-document comparisons. Experiments on TREC DL 20 and multiple BEIR benchmarks show that RankSteer consistently improves ranking quality using only a small number of anchor queries, demonstrating that substantial ranking capacity remains under-utilized in pointwise LLM rankers. We further provide a geometric analysis revealing that steering improves ranking by stabilizing ranking geometry and reducing dispersion, offering new insight into how LLMs internally represent and calibrate relevance judgments.
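The projection-based intervention mentioned above can be illustrated with a minimal sketch: given a hidden state and a unit steering direction, remove the state's current component along that direction and re-inject it at a controlled magnitude. This is a generic illustration of activation steering, not RankSteer's exact procedure; the direction vector, the scale `alpha`, and the layer at which the edit is applied are all hypothetical.

```python
import numpy as np

def steer(hidden: np.ndarray, direction: np.ndarray, alpha: float) -> np.ndarray:
    """Replace hidden's component along `direction` with magnitude `alpha`.

    hidden:    a layer's activation vector (stand-in for an LLM hidden state)
    direction: a learned steering direction (e.g. decision/evidence/role)
    alpha:     target projection coefficient after the intervention
    """
    d = direction / np.linalg.norm(direction)  # unit steering direction
    coef = hidden @ d                          # current projection onto d
    return hidden - coef * d + alpha * d       # project out, then re-inject

rng = np.random.default_rng(0)
h = rng.normal(size=8)   # illustrative hidden state
d = rng.normal(size=8)   # illustrative steering direction
h_steered = steer(h, d, alpha=2.0)
```

Setting `alpha = 0` projects the direction out entirely (e.g. suppressing a role signal), while a positive `alpha` amplifies it; components of the hidden state orthogonal to the steering direction are left untouched.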