Bringing Reasoning to Generative Recommendation Through the Lens of Cascaded Ranking

Generative Recommendation (GR) has become a promising end-to-end approach with high FLOPS utilization for resource-efficient recommendation. Despite its effectiveness, we show that current GR models suffer from a critical bias amplification issue, in which token-level bias escalates as token generation progresses, ultimately limiting recommendation diversity and hurting the user experience. By comparing GR against the key factor behind the success of traditional multi-stage pipelines, we reveal two limitations in GR that can amplify the bias: homogeneous reliance on the encoded history, and fixed computational budgets that prevent a deeper understanding of user preferences. To combat bias amplification, it is crucial for GR to 1) incorporate more heterogeneous information, and 2) allocate greater computational resources at each token generation step. To this end, we propose CARE, a simple yet effective cascaded reasoning framework for debiased GR. To incorporate heterogeneous information, we introduce a progressive history encoding mechanism, which incorporates increasingly fine-grained history information as the generation process advances. To allocate more computation, we propose a query-anchored reasoning mechanism, which seeks a deeper understanding of historical information through parallel reasoning steps. We instantiate CARE on three GR backbones. Empirical results on four datasets show that CARE improves recommendation accuracy, diversity, and efficiency, and exhibits promising scalability. The code and datasets are available at https://github.com/Linxyhaha/CARE.
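
As a rough illustration of the progressive history encoding idea, the sketch below builds a visibility mask in which each later generation step may see one more (finer) token level of every historical item. This is only one possible reading of "increasingly fine-grained history information", not the CARE implementation: the function name `progressive_visibility`, the assumption that each item is a fixed-length coarse-to-fine token sequence, and the rule "level <= step" are all hypothetical choices made for this example.

```python
def progressive_visibility(num_history_items: int,
                           tokens_per_item: int,
                           step: int) -> list[list[bool]]:
    """Hypothetical mask: mask[item][level] is True when token `level`
    of history item `item` is visible at generation step `step`.

    Assumption (not from the paper): every item is encoded as
    `tokens_per_item` tokens ordered from coarse (level 0) to fine,
    and step t exposes levels 0..t of each historical item.
    """
    if not 0 <= step < tokens_per_item:
        raise ValueError("step must be in [0, tokens_per_item)")
    return [
        [level <= step for level in range(tokens_per_item)]
        for _ in range(num_history_items)
    ]


def visible_count(mask: list[list[bool]]) -> int:
    """Count how many history tokens the mask exposes."""
    return sum(cell for row in mask for cell in row)
```

Under these assumptions, the number of visible history tokens grows with the step index, so later (finer) target tokens are conditioned on strictly more history detail than earlier (coarser) ones.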
