Modeling long-term user interests for click-through rate (CTR) prediction in large-scale recommendation systems is attracting growing attention from researchers and practitioners. Existing work, such as SIM and TWIN, typically models long-term user behavior sequences with a two-stage approach for efficiency. The first stage uses a search-based mechanism, the General Search Unit (GSU), to rapidly retrieve from a long sequence the subset of behaviors related to the target item; the second stage uses the Exact Search Unit (ESU) to compute interest scores on the retrieved results. Because user behavior sequences spanning the entire life cycle are extremely long, potentially reaching a scale of 10^6, there is currently no effective solution for fully modeling such extensive user interests. To address this issue, we introduce TWIN-V2, an enhancement of TWIN that applies a divide-and-conquer approach to compress life-cycle behaviors and uncover more accurate and diverse user interests. Specifically, during the offline phase, a hierarchical clustering method groups items with similar characteristics in the life-cycle behaviors into clusters. By limiting the size of clusters, we can compress behavior sequences well beyond the magnitude of 10^5 to a length that GSU retrieval can handle during online inference. Cluster-aware target attention then extracts comprehensive, multi-faceted long-term user interests, making the final recommendations more accurate and diverse. Extensive offline experiments on a multi-billion-scale industrial dataset and online A/B tests demonstrate the effectiveness of TWIN-V2. Under an efficient deployment framework, TWIN-V2 has been successfully deployed to the primary traffic at Kuaishou, serving hundreds of millions of daily active users.
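To make the compress-then-attend idea concrete, the following is a minimal sketch and not the TWIN-V2 implementation. It assumes each behavior is already represented by an embedding vector, replaces the hierarchical clustering with a simple recursive 2-means bisection that stops once a cluster holds at most `max_size` items, and makes attention "cluster-aware" by adding the log of the cluster size to each score so that larger clusters weigh more. The function names, the bisection strategy, and the log-size adjustment are all illustrative assumptions that the text above does not specify.

```python
import numpy as np

def split_until_small(emb, idx, max_size, rng):
    """Recursively bisect the behaviors in `idx` until every cluster has
    at most `max_size` members (hypothetical stand-in for the offline
    hierarchical clustering)."""
    if len(idx) <= max_size:
        return [idx]
    # Initialise two centers from two distinct behaviors in this group.
    centers = emb[rng.choice(idx, size=2, replace=False)]
    for _ in range(10):  # a few Lloyd iterations of 2-means
        dist = ((emb[idx][:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        assign = dist.argmin(axis=1)
        if assign.min() == assign.max():  # everything fell into one side
            break
        centers = np.stack([emb[idx[assign == c]].mean(axis=0) for c in (0, 1)])
    if assign.min() == assign.max():
        # Degenerate split (e.g. duplicate embeddings): halve by position
        # so the recursion is guaranteed to terminate.
        half = len(idx) // 2
        parts = [idx[:half], idx[half:]]
    else:
        parts = [idx[assign == 0], idx[assign == 1]]
    return [c for p in parts for c in split_until_small(emb, p, max_size, rng)]

def compress_behaviors(emb, max_size, seed=0):
    """Compress a long behavior sequence into cluster centroids and sizes."""
    rng = np.random.default_rng(seed)
    clusters = split_until_small(emb, np.arange(len(emb)), max_size, rng)
    centroids = np.stack([emb[c].mean(axis=0) for c in clusters])
    sizes = np.array([len(c) for c in clusters])
    return centroids, sizes

def cluster_aware_attention(target, centroids, sizes):
    """Target attention over clusters. Adding log(size) to each score is an
    assumed way of letting a cluster count once per behavior it contains."""
    d = centroids.shape[1]
    logits = centroids @ target / np.sqrt(d) + np.log(sizes)
    logits = logits - logits.max()  # numerical stability
    weights = np.exp(logits)
    weights = weights / weights.sum()
    return weights @ centroids, weights

# Usage on synthetic data: 5,000 behaviors with 16-dimensional embeddings.
rng = np.random.default_rng(42)
behaviors = rng.normal(size=(5000, 16))
target_item = rng.normal(size=16)
centroids, sizes = compress_behaviors(behaviors, max_size=50)
interest, weights = cluster_aware_attention(target_item, centroids, sizes)
print(len(behaviors), "behaviors compressed to", len(centroids), "clusters")
```

Capping the cluster size rather than fixing the number of clusters keeps each cluster internally coherent while still bounding the compressed length, which is the property the abstract relies on for online GSU retrieval.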