Generative recommendation (GR) typically first quantizes continuous item embeddings into multi-level semantic IDs (SIDs) and then generates the next item via autoregressive decoding. Although existing methods already achieve competitive recommendation performance, directly inheriting the autoregressive decoding paradigm from language models has three key limitations: (1) autoregressive decoding struggles to jointly capture global dependencies among the multi-dimensional features associated with different positions of an SID; (2) using a single, fixed decoding path for the same item implicitly assumes that all users attend to item attributes in the same order; (3) autoregressive decoding is inefficient at inference time and struggles to meet real-time requirements. To tackle these challenges, we propose MDGR, a Masked Diffusion Generative Recommendation framework that reshapes the GR pipeline from three perspectives: codebook, training, and inference. (1) We adopt a parallel codebook to provide a structural foundation for diffusion-based GR. (2) During training, we adaptively construct masking supervision signals along both the temporal and sample dimensions. (3) During inference, we develop a warm-up-based two-stage parallel decoding strategy for efficient generation of SIDs. Extensive experiments on multiple public and industrial-scale datasets show that MDGR outperforms ten state-of-the-art baselines by up to 10.78%. Furthermore, deploying MDGR on a large-scale online advertising platform yields a 1.20% increase in revenue, demonstrating its practical value.
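To make the contrast with left-to-right decoding concrete, the following is a minimal sketch of one way a warm-up-then-parallel decoding loop over masked SID positions could look. It is not MDGR's actual procedure, which the abstract does not specify. The scorer `toy_scores`, the function `decode_sid`, the parameter `warmup_steps`, and the rule of committing the most confident position first are all illustrative assumptions, with random scores standing in for a trained denoising model.

```python
import random

MASK = -1  # sentinel for an SID position that has not been decoded yet


def toy_scores(sid, position, codebook_size, rng):
    """Hypothetical stand-in for a trained denoiser.

    A real model would condition on the user history and on the partially
    decoded `sid`; this toy version ignores both and returns random
    probabilities over the codes available at `position`.
    """
    weights = [rng.random() for _ in range(codebook_size)]
    total = sum(weights)
    return [w / total for w in weights]


def decode_sid(num_positions, codebook_size, warmup_steps, score_fn, seed=0):
    """Decode one SID in two stages (assumed scheme, for illustration only)."""
    rng = random.Random(seed)
    sid = [MASK] * num_positions
    order = []  # positions in the order they were committed

    def propose():
        # For every still-masked position, pick its most likely code.
        proposals = {}
        for pos in range(num_positions):
            if sid[pos] == MASK:
                probs = score_fn(sid, pos, codebook_size, rng)
                code = max(range(codebook_size), key=probs.__getitem__)
                proposals[pos] = (code, probs[code])
        return proposals

    # Stage 1 (warm-up): commit only the single most confident position per
    # round, so the order of positions is not fixed in advance.
    for _ in range(min(warmup_steps, num_positions)):
        proposals = propose()
        pos = max(proposals, key=lambda p: proposals[p][1])
        sid[pos] = proposals[pos][0]
        order.append(pos)

    # Stage 2: fill all remaining masked positions in one parallel round.
    for pos, (code, _) in propose().items():
        sid[pos] = code
        order.append(pos)

    return sid, order


if __name__ == "__main__":
    sid, order = decode_sid(num_positions=4, codebook_size=8,
                            warmup_steps=2, score_fn=toy_scores)
    print("decoded SID:", sid)
    print("commit order:", order)
```

With four positions and two warm-up steps, this sketch uses three scoring rounds instead of the four sequential steps that left-to-right decoding would need, and the order in which positions are committed can differ from one input to the next.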