Conventional Sequential Recommender Systems (SRS) typically assign unique hash IDs (HID) to construct item embeddings, which mainly capture collaborative signals from historical user-item interactions. However, such embeddings are vulnerable in long-tail scenarios where most items are rarely consumed. Recent methods that incorporate auxiliary information often face noisy collaborative sharing from co-occurrence signals or semantic homogeneity caused by flat dense embeddings. In contrast, Semantic IDs (SID), with their support for code sharing and multi-granular semantic modeling, offer a promising alternative. Nevertheless, SID-based methods are hindered by a collaborative overwhelming phenomenon: commonly adopted quantization mechanisms compromise the identifier uniqueness needed to model head items, resulting in a performance trade-off between head and tail items. To address this challenge, we propose H2Rec, a novel framework that harmonizes SID and HID. We design a dual-branch modeling architecture that simultaneously captures the multi-granular semantics of SID while preserving the unique collaborative identity provided by HID. Moreover, we introduce a dual-level alignment strategy to bridge the two representations, enabling effective knowledge transfer and robust preference modeling. Extensive offline experiments on three public benchmarks and online experiments on a large-scale commercial platform demonstrate that H2Rec achieves a better balance between head and tail recommendation quality and consistently outperforms existing baselines.
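The trade-off between code sharing and identifier uniqueness can be illustrated with a toy example. The sketch below is not H2Rec's quantizer; it assumes a plain two-level residual quantization over hand-picked 2-D embeddings, and the names `nearest`, `semantic_id`, `CODEBOOKS`, and `ITEMS` are hypothetical. It shows how two distinct items with similar content can receive the same SID while their HIDs remain unique.

```python
from typing import Sequence, Tuple

Vector = Sequence[float]


def nearest(vec: Vector, codebook: Sequence[Vector]) -> int:
    """Return the index of the codeword closest to vec (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((v - c) ** 2 for v, c in zip(vec, codebook[i])),
    )


def semantic_id(vec: Vector, codebooks: Sequence[Sequence[Vector]]) -> Tuple[int, ...]:
    """Residual quantization: pick one codeword per level, then quantize what is left."""
    residual = list(vec)
    codes = []
    for codebook in codebooks:
        idx = nearest(residual, codebook)
        codes.append(idx)
        # Subtract the chosen codeword so the next level encodes the finer detail.
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return tuple(codes)


# Hypothetical two-level codebooks (coarse, then fine); the values are illustrative only.
CODEBOOKS = [
    [(0.0, 0.0), (1.0, 1.0)],
    [(-0.2, -0.2), (0.2, 0.2)],
]

# Hypothetical content embeddings keyed by hash ID (HID); every HID is unique by construction.
ITEMS = {
    101: (0.9, 0.9),
    102: (0.85, 0.8),
    103: (0.1, 0.2),
}

SIDS = {hid: semantic_id(vec, CODEBOOKS) for hid, vec in ITEMS.items()}

for hid, sid in SIDS.items():
    print(f"HID {hid} -> SID {sid}")
```

Under these assumptions, items 101 and 102 map to the same code sequence. That sharing is helpful for a rarely consumed item, which can borrow signal from its neighbours, but it removes the distinction a frequently consumed item would need.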