Generative Recommendation has emerged as a promising paradigm, reformulating recommendation as a sequence-to-sequence generation task over hierarchical Semantic IDs. However, existing methods suffer from a critical issue we term Semantic Drift, where errors in early, high-level tokens irreversibly divert the generation trajectory into irrelevant semantic subspaces. Inspired by Process Reward Models (PRMs) that enhance reasoning in Large Language Models, we propose Promise, a novel framework that integrates dense, step-by-step verification into generative models. Promise features a lightweight PRM to assess the quality of intermediate inference steps, coupled with a PRM-guided Beam Search strategy that leverages dense feedback to dynamically prune erroneous branches. Crucially, our approach unlocks Test-Time Scaling Laws for recommender systems: by increasing inference compute, smaller models can match or surpass larger models. Extensive offline experiments and online A/B tests on a large-scale platform demonstrate that Promise effectively mitigates Semantic Drift, significantly improving recommendation accuracy while enabling efficient deployment.
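To make the core mechanism concrete, the sketch below illustrates how a PRM-guided beam search over hierarchical Semantic IDs might look. This is a minimal illustration under assumptions, not the paper's implementation: `lm_logprobs` and `prm_score` are hypothetical stand-ins for the generative model's next-token distribution and the lightweight PRM, and the scoring rule (LM log-probability plus a weighted log of the PRM reward) is one plausible way to combine the two signals.

```python
# Minimal sketch of PRM-guided beam search over hierarchical Semantic IDs.
# All names (lm_logprobs, prm_score, alpha) are hypothetical placeholders,
# not the paper's actual API.

import heapq
import math
from typing import Callable, List, Sequence, Tuple


def prm_guided_beam_search(
    lm_logprobs: Callable[[Sequence[int]], Sequence[float]],  # next-token log-probs given a prefix
    prm_score: Callable[[Sequence[int]], float],              # step-level quality score in (0, 1]
    num_levels: int,       # depth of the Semantic ID hierarchy
    vocab_size: int,       # codebook size per level
    beam_width: int = 4,
    alpha: float = 1.0,    # weight of the PRM reward relative to the LM score
) -> List[Tuple[List[int], float]]:
    """Expand beams level by level; at each level, rerank candidates by
    LM log-probability plus a PRM reward so drifting prefixes are pruned early."""
    beams: List[Tuple[float, List[int]]] = [(0.0, [])]
    for _ in range(num_levels):
        candidates: List[Tuple[float, List[int]]] = []
        for score, prefix in beams:
            logps = lm_logprobs(prefix)
            for tok in range(vocab_size):
                new_prefix = prefix + [tok]
                # Dense step-level feedback: the PRM judges the partial
                # trajectory rather than waiting for the full Semantic ID.
                reward = alpha * math.log(max(prm_score(new_prefix), 1e-9))
                candidates.append((score + logps[tok] + reward, new_prefix))
        # Keep only the beam_width best-scoring partial trajectories.
        beams = heapq.nlargest(beam_width, candidates, key=lambda c: c[0])
    return [(prefix, score) for score, prefix in beams]
```

The key design point is that the PRM reward enters at every level of the hierarchy, so a prefix whose early high-level token has drifted into an irrelevant semantic subspace accumulates a low score and is dropped before the beam is wasted on its descendants; widening `beam_width` is the knob that trades extra inference compute for accuracy, which is the test-time scaling behavior described above.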