From Head to Tail: Asymmetric Knowledge Transfer in Long-tail Recommendation with Generative Semantic IDs

Long-tail recommendation in real-world e-commerce platforms remains challenging due to severe data imbalance. Existing methods often struggle to combine content-based multimodal features with collaborative signals. Many of these methods also ignore an important asymmetry in knowledge transfer between head and tail IDs: noisy signals from tail IDs can hurt representation learning for head IDs. This paper presents AKT-Rec, a framework for Asymmetric Knowledge Transfer in long-tail Recommendation that uses LLM-generated semantic IDs. AKT-Rec uses Multimodal LLMs (MLLMs) with supervised fine-tuning to align content representations with collaborative information for both items and users, producing semantic representations. It then discretizes these representations into semantic IDs with a Residual-Quantized VAE (RQ-VAE), which yields semantic clusters of similar entities. AKT-Rec has two main components: (1) Cluster-Guided Adaptive Embedding, which decomposes each ID representation into a cluster-level embedding that captures shared semantics and an individual embedding. Through an asymmetric contrastive objective and an activity-aware gating mechanism, this module directs knowledge transfer from head to tail IDs. (2) Hierarchical Feature Aggregation, which builds parallel feature views and adaptively fuses them to optimize predictions for samples with varying activity levels. Extensive experiments on a large-scale industrial dataset and online A/B testing on the Alibaba Tmall platform demonstrate the effectiveness of AKT-Rec. AKT-Rec improves offline performance by 0.35% in AUC and 1.53% in GAUC, outperforming several competitive baselines. In online A/B testing, AKT-Rec achieves a 2.76% increase in CTR and a 3.47% increase in GMV, validating its utility in real-world production environments.
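The two mechanisms named above, residual quantization into semantic IDs and activity-aware gating between a cluster-level and an individual embedding, can be illustrated with a minimal sketch. It is not the authors' implementation. The codebooks are fixed toy values rather than ones learned by an RQ-VAE, the sigmoid form of the gate and its `midpoint` and `scale` parameters are assumptions made for illustration, and all function names are hypothetical.

```python
import math


def nearest(codebook, vec):
    """Return the index of the codebook vector closest to vec (squared L2)."""
    best_idx, best_dist = 0, math.inf
    for idx, code in enumerate(codebook):
        dist = sum((c - v) ** 2 for c, v in zip(code, vec))
        if dist < best_dist:
            best_idx, best_dist = idx, dist
    return best_idx


def residual_quantize(vec, codebooks):
    """Greedy residual quantization: pick one code per level, then
    quantize what is left over at the next level."""
    residual = list(vec)
    semantic_id = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        semantic_id.append(idx)
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return tuple(semantic_id)


def gated_embedding(cluster_emb, individual_emb, activity,
                    midpoint=50.0, scale=10.0):
    """Blend the two embeddings with a gate driven by activity.

    Assumed form: a sigmoid over the interaction count, so active (head)
    IDs rely on their individual embedding and inactive (tail) IDs fall
    back on the shared cluster embedding.
    """
    gate = 1.0 / (1.0 + math.exp(-(activity - midpoint) / scale))
    return [gate * i + (1.0 - gate) * c
            for c, i in zip(cluster_emb, individual_emb)]


# Toy two-level codebooks (hypothetical values, not learned).
codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],    # level 1: coarse clusters
    [[0.0, 0.0], [0.1, -0.1]],   # level 2: refinement of the residual
]

item_a = residual_quantize([1.1, 0.9], codebooks)
item_b = residual_quantize([0.05, 0.0], codebooks)

tail = gated_embedding([0.0], [1.0], activity=0.0)
head = gated_embedding([0.0], [1.0], activity=1000.0)
```

In this sketch, entities whose semantic IDs share a prefix fall into the same coarse cluster, which is what allows a tail ID to borrow the cluster-level embedding. The asymmetric contrastive objective and Hierarchical Feature Aggregation are not covered here.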