Recent years have seen a rapid surge in research leveraging Large Language Models (LLMs) for recommendation. These methods typically employ supervised fine-tuning (SFT) to adapt LLMs to recommendation scenarios, and utilize beam search during inference to efficiently retrieve the $B$ top-ranked recommended items. However, we identify a critical training-inference inconsistency: while SFT optimizes the overall probability of positive items, it does not guarantee that such items will be retrieved by beam search, even if they possess high overall probabilities. Because beam search prunes greedily, it can prematurely discard a positive item as soon as the probability of one of its prefixes falls outside the top $B$ at some decoding step. To address this inconsistency, we propose BEAR (Beam-SEarch-Aware Regularization), a novel fine-tuning objective that explicitly accounts for beam search behavior during training. Rather than directly simulating beam search for each instance during training, which is computationally prohibitive, BEAR enforces a relaxed necessary condition: each token in a positive item must rank within the top-$B$ candidate tokens at each decoding step. This objective effectively mitigates the risk of incorrect pruning while incurring negligible computational overhead compared to standard SFT. Extensive experiments across four real-world datasets demonstrate that BEAR significantly outperforms strong baselines. Code is available at https://github.com/Tiny-Snow/BEAR-SIGIR-2026.
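To make the inconsistency concrete, the following toy sketch shows an item with the highest overall probability being pruned by beam search, together with a check of the relaxed necessary condition (every token ranks within the top-$B$ at its decoding step). This is an illustration only, not the BEAR objective or the released implementation: the probability table, the two-token item length, and the names `NEXT_TOKEN_PROBS`, `beam_search`, and `tokens_within_top_b` are all assumptions made up for the example.

```python
# Hypothetical next-token distributions, keyed by the prefix decoded so far.
# Each item is a sequence of two tokens; the numbers are made up.
NEXT_TOKEN_PROBS = {
    (): {"a": 0.40, "b": 0.35, "c": 0.25},
    ("a",): {"x": 0.5, "y": 0.5},
    ("b",): {"x": 0.5, "y": 0.5},
    ("c",): {"x": 1.0},
}
ITEM_LENGTH = 2


def sequence_prob(item):
    """Overall probability of an item: product of its per-step token probabilities."""
    prob = 1.0
    for t in range(len(item)):
        prob *= NEXT_TOKEN_PROBS[item[:t]].get(item[t], 0.0)
    return prob


def beam_search(beam_width):
    """Plain beam search: keep the beam_width most probable prefixes at each step."""
    beams = [((), 1.0)]
    for _ in range(ITEM_LENGTH):
        candidates = [
            (prefix + (token,), score * p)
            for prefix, score in beams
            for token, p in NEXT_TOKEN_PROBS[prefix].items()
        ]
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]  # greedy pruning happens here
    return [prefix for prefix, _ in beams]


def tokens_within_top_b(item, beam_width):
    """Relaxed necessary condition: each token ranks in the top-B given its prefix."""
    for t in range(len(item)):
        dist = NEXT_TOKEN_PROBS[item[:t]]
        rank = 1 + sum(p > dist[item[t]] for p in dist.values())
        if rank > beam_width:
            return False
    return True


# Enumerate all complete items and pick the one with the highest overall probability.
all_items = [
    prefix + (token,)
    for prefix, dist in NEXT_TOKEN_PROBS.items()
    if len(prefix) == ITEM_LENGTH - 1
    for token in dist
]
best_item = max(all_items, key=sequence_prob)

print("most probable item:", best_item, sequence_prob(best_item))
print("retrieved with B=2:", beam_search(2))
print("condition holds for B=2:", tokens_within_top_b(best_item, 2))
```

In this toy setting the most probable item starts with a token that ranks third at the first step, so with $B=2$ its prefix is dropped before its strong second token can contribute; the token-level check flags exactly this case without running beam search. A training objective in the spirit of the abstract would penalize such rank violations, but the specific loss used by BEAR is not reproduced here.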