Conversion objectives in large-scale recommender systems are sparse, making them difficult to optimize. Generative recommendation (GR) partially alleviates data sparsity by organizing multi-type behaviors into a unified token sequence with shared representations, but conversion signals remain insufficiently modeled. While recent behavior-aware GR models encode behavior types and employ behavior-aware attention to highlight decision-related intermediate behaviors, they still rely on standard attention over the full history and provide no additional supervision for conversions, leaving conversion sparsity largely unresolved. To address these challenges, we propose RCLRec, a reverse curriculum learning-based GR framework for sparse conversion supervision. For each conversion target, RCLRec constructs a short curriculum by selecting a subsequence of conversion-related items from the history in reverse. Their semantic tokens are fed to the decoder as a prefix, together with the target conversion tokens, under a joint generation objective. This design provides additional instance-specific intermediate supervision, alleviating conversion sparsity and focusing the model on the user's critical decision process. We further introduce a curriculum quality-aware loss to ensure that the selected curricula are informative for conversion prediction. Experiments on offline datasets and an online A/B test show that RCLRec achieves superior performance, with +2.09% advertising revenue and +1.86% orders in online deployment.
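The curriculum construction described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how conversion-related items are identified, so the selection rule here (same category as the target, restricted to a small set of behavior types) is purely an assumption, as are the names `Event`, `build_reverse_curriculum`, `decoder_target_tokens`, and the `max_len` parameter. The sketch also assumes that the prefix keeps the reverse (most recent first) order in which items are selected. It omits the model, the joint generation objective, and the curriculum quality-aware loss.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Sequence, Tuple


@dataclass(frozen=True)
class Event:
    """One user behavior. All fields are hypothetical, for illustration only."""
    item_id: int
    behavior: str                      # e.g. "view", "click", "cart", "purchase"
    category: str
    semantic_tokens: Tuple[int, ...]   # semantic ID tokens of the item


def build_reverse_curriculum(
    history: Sequence[Event],
    target: Event,
    max_len: int = 3,
    related_behaviors: FrozenSet[str] = frozenset({"click", "cart"}),
) -> List[Event]:
    """Scan the history from newest to oldest and keep up to `max_len` events.

    ASSUMPTION: an event counts as conversion-related when it shares the
    target's category and its behavior type is in `related_behaviors`.
    The paper's actual selection criterion is not given in the abstract.
    """
    curriculum: List[Event] = []
    for event in reversed(history):
        if len(curriculum) == max_len:
            break
        if event.category == target.category and event.behavior in related_behaviors:
            curriculum.append(event)
    return curriculum


def decoder_target_tokens(curriculum: Sequence[Event], target: Event) -> List[int]:
    """Concatenate curriculum tokens (as a prefix) with the target's tokens.

    Under a joint generation objective, the decoder would be trained to
    generate this whole sequence, so the prefix tokens act as additional
    intermediate supervision for the conversion target.
    """
    tokens: List[int] = []
    for event in curriculum:
        tokens.extend(event.semantic_tokens)
    tokens.extend(target.semantic_tokens)
    return tokens
```

With a history of a view, a click, and an add-to-cart in the target's category, plus one click in another category, `build_reverse_curriculum(history, target, max_len=2)` would return the add-to-cart followed by the same-category click, and `decoder_target_tokens` would place their tokens ahead of the target's.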