UniRec: Bridging the Expressive Gap between Generative and Discriminative Recommendation via Chain-of-Attribute

Generative Recommendation (GR) reframes retrieval and ranking as autoregressive decoding over Semantic IDs (SIDs), unifying the multi-stage pipeline into a single model. Yet a fundamental expressive gap persists: discriminative models score items with direct feature access, enabling explicit user-item crossing, whereas GR decodes over compact SID tokens without item-side signals. We formalize this via Bayes' theorem: ranking by p(y|f,u) is equivalent to ranking by p(f|y,u), which factorizes autoregressively over item features. This shows that a generative model with full feature access matches its discriminative counterpart, with any practical gap stemming solely from incomplete feature coverage. We propose UniRec with Chain-of-Attribute (CoA) as its core mechanism. CoA prefixes each SID sequence with structured attribute tokens (category, seller, brand) before decoding the SID, recovering the item-side feature crossing that discriminative models exploit. Since items sharing identical attributes cluster in adjacent SID regions, attribute conditioning yields a measurable per-step entropy reduction H(s_k|s<k,a) < H(s_k|s<k), narrowing the search space and stabilizing beam search. We further address two deployment challenges. First, Capacity-constrained SID introduces exposure-weighted capacity penalties into residual quantization to suppress token collapse and the Matthew effect. Second, Conditional Decoding Context (CDC) combines Task-Conditioned BOS with hash-based Content Summaries to inject scenario signals at each decoding step. A joint RFT and DPO framework aligns the model with business objectives beyond distribution matching. Experiments show UniRec outperforms the strongest baseline by +22.6% HR@50 overall and +15.5% on high-value orders. Deployed on Shopee's e-commerce platform, UniRec achieves significant gains in online A/B tests: PVCTR (+5.37%), orders (+4.76%), and GMV (+5.60%).
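To make the Chain-of-Attribute idea and the entropy claim concrete, the following is a minimal sketch on toy data, not the authors' implementation. The interaction rows, the token format (`cat:`, `seller:`, `brand:`, `sid0:`), and the helper names `coa_sequence`, `entropy`, and `conditional_entropy` are all assumptions introduced for illustration. It builds an attribute-prefixed target sequence and estimates H(s_k|s<k,a) and H(s_k|s<k) from empirical counts.

```python
from collections import Counter, defaultdict
from math import log2

# Hypothetical toy interactions: (category, seller, brand, SID tokens).
# Real SIDs would come from a learned residual quantizer; these are made up.
INTERACTIONS = [
    ("phones", "s1", "acme", (0, 3)),
    ("phones", "s1", "acme", (0, 4)),
    ("phones", "s2", "zen", (0, 3)),
    ("shoes", "s3", "stride", (1, 7)),
    ("shoes", "s3", "stride", (1, 8)),
    ("shoes", "s4", "stride", (1, 7)),
]


def coa_sequence(category, seller, brand, sid):
    """Prefix the SID tokens with attribute tokens (assumed token format)."""
    attrs = [f"cat:{category}", f"seller:{seller}", f"brand:{brand}"]
    return attrs + [f"sid{k}:{tok}" for k, tok in enumerate(sid)]


def entropy(counts):
    """Shannon entropy in bits of an empirical count distribution."""
    total = sum(counts.values())
    return -sum(c / total * log2(c / total) for c in counts.values())


def conditional_entropy(rows, k, use_attrs):
    """Empirical H(s_k | s_<k) or, with use_attrs=True, H(s_k | s_<k, a)."""
    groups = defaultdict(Counter)
    for cat, seller, brand, sid in rows:
        context = sid[:k] + ((cat, seller, brand) if use_attrs else ())
        groups[context][sid[k]] += 1
    n = len(rows)
    # Weight each context's entropy by how often that context occurs.
    return sum(sum(c.values()) / n * entropy(c) for c in groups.values())


if __name__ == "__main__":
    print(coa_sequence("phones", "s1", "acme", (0, 3)))
    print("H(s_1 | s_0)    =", conditional_entropy(INTERACTIONS, 1, False))
    print("H(s_1 | s_0, a) =", conditional_entropy(INTERACTIONS, 1, True))
```

On this toy data, conditioning on the attribute prefix lowers the estimated entropy of the second SID token. In general, conditioning on extra context can never raise conditional entropy. How large the reduction is in practice depends on how tightly items with shared attributes cluster in SID space.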

