Recent years have seen a rapid surge in research leveraging Large Language Models (LLMs) for recommendation. These methods typically employ supervised fine-tuning (SFT) to adapt LLMs to recommendation scenarios, and use beam search during inference to efficiently retrieve the top-$B$ recommended items. However, we identify a critical training-inference inconsistency: SFT optimizes the overall probability of positive items, but a high overall probability does not guarantee that such an item will be retrieved by beam search. Because beam search prunes greedily, it can prematurely discard a positive item as soon as the probability of one of its prefixes is too low to remain in the beam. To address this inconsistency, we propose BEAR (Beam-SEarch-Aware Regularization), a novel fine-tuning objective that explicitly accounts for beam search behavior during training. Rather than directly simulating beam search for each instance during training, which is computationally prohibitive, BEAR enforces a relaxed necessary condition: each token in a positive item must rank within the top-$B$ candidate tokens at each decoding step. This objective effectively mitigates the risk of incorrect pruning while incurring negligible computational overhead compared to standard SFT. Extensive experiments across four real-world datasets demonstrate that BEAR significantly outperforms strong baselines. Code is available at https://github.com/Tiny-Snow/BEAR-SIGIR-2026 .
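The inconsistency described above can be illustrated with a small toy example. The sketch below is not the BEAR implementation or its loss; it only assumes a hypothetical three-token vocabulary, two-token items, and hand-picked next-token probabilities (`NEXT_TOKEN_PROBS`, `ITEM_LENGTH`, and all function names are made up for illustration). It shows an item with the highest overall probability being pruned by beam search with $B = 2$, and checks the relaxed top-$B$ token-rank condition. The condition is necessary because a token ranked below $B$ of its siblings under the same prefix has at least $B$ higher-scoring candidates ahead of it at that step.

```python
import math
from itertools import product

# Hypothetical toy model (an assumption for illustration only):
# next-token distributions conditioned on the prefix decoded so far.
NEXT_TOKEN_PROBS = {
    (): {"a": 0.5, "b": 0.3, "c": 0.2},
    ("a",): {"a": 0.36, "b": 0.34, "c": 0.30},
    ("b",): {"a": 0.40, "b": 0.30, "c": 0.30},
    ("c",): {"a": 0.95, "b": 0.03, "c": 0.02},
}
ITEM_LENGTH = 2  # every item is a sequence of two tokens


def sequence_log_prob(tokens):
    """Overall log-probability of an item (the quantity SFT pushes up)."""
    return sum(
        math.log(NEXT_TOKEN_PROBS[tuple(tokens[:i])][tok])
        for i, tok in enumerate(tokens)
    )


def beam_search(beam_size):
    """Standard beam search: keep the beam_size best prefixes at each step."""
    beams = [((), 0.0)]
    for _ in range(ITEM_LENGTH):
        candidates = [
            (prefix + (tok,), score + math.log(p))
            for prefix, score in beams
            for tok, p in NEXT_TOKEN_PROBS[prefix].items()
        ]
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_size]  # greedy pruning happens here
    return [prefix for prefix, _ in beams]


def satisfies_top_b_condition(tokens, beam_size):
    """Relaxed necessary condition: every token ranks within the top-B
    candidate tokens at its decoding step, given the item's own prefix."""
    for i, tok in enumerate(tokens):
        probs = NEXT_TOKEN_PROBS[tuple(tokens[:i])]
        rank = 1 + sum(p > probs[tok] for p in probs.values())
        if rank > beam_size:
            return False
    return True


all_items = list(product("abc", repeat=ITEM_LENGTH))
most_probable_item = max(all_items, key=sequence_log_prob)
retrieved = beam_search(beam_size=2)

print("most probable item:", most_probable_item)
print("retrieved with B=2:", retrieved)
print("condition holds for it:", satisfies_top_b_condition(most_probable_item, 2))
```

In this toy setup the item `("c", "a")` has the largest overall probability (0.2 × 0.95 = 0.19), yet its first token ranks third at the first step, so the prefix `("c",)` is dropped before its strong second token is ever scored. Checking token ranks costs one comparison pass over the logits already computed for SFT, which is consistent with the abstract's point that enforcing the condition is far cheaper than simulating beam search per instance.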