The proliferation of online micro-video platforms has underscored the need for advanced recommender systems that mitigate information overload and deliver tailored content. Despite recent advances, capturing dynamic user interests accurately and promptly remains a formidable challenge. Inspired by the Platonic Representation Hypothesis, which posits that different data modalities converge towards a shared statistical model of reality, we introduce DreamUMM (Dreaming User Multi-Modal Representation), a novel approach that leverages users' historical behaviors to create a real-time user representation in a multimodal space. DreamUMM employs a closed-form solution that correlates user video preferences with multimodal similarity, based on the hypothesis that user interests can be effectively represented in a unified multimodal space. Additionally, we propose Candidate-DreamUMM for scenarios lacking recent user behavior data, which infers interests from candidate videos alone. Extensive online A/B tests demonstrate significant improvements in user engagement metrics, including active days and play count. The successful deployment of DreamUMM on two micro-video platforms with hundreds of millions of daily active users illustrates its practical efficacy and scalability in personalized micro-video content delivery. Our work contributes to the ongoing exploration of representational convergence by providing empirical evidence supporting the potential for user interest representations to reside in a multimodal space.
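To make the idea of a closed-form user representation concrete, the sketch below shows one simple way such a solution could look. It is an illustration under stated assumptions, not the formulation used by DreamUMM: it assumes each watched video has a multimodal embedding and a scalar preference weight (for example, derived from watch time), and that the user vector is the unit vector maximizing the preference-weighted sum of dot products with the unit-normalized video embeddings. Under that assumed objective, the maximizer is the normalized weighted sum of the embeddings. The names `dream_user_vector` and `score_candidates` are hypothetical.

```python
import math
from typing import List, Sequence

Vector = List[float]


def _normalize(vec: Sequence[float]) -> Vector:
    """Scale a vector to unit L2 norm; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return [0.0 for _ in vec]
    return [x / norm for x in vec]


def dream_user_vector(
    video_embs: Sequence[Sequence[float]],
    weights: Sequence[float],
) -> Vector:
    """Hypothetical closed-form user vector (an assumption, not the paper's API).

    Solves argmax_{||u|| = 1} sum_i w_i * <u, v_i>, where v_i are the
    unit-normalized video embeddings; the solution is the normalized
    weighted sum of the v_i.
    """
    if not video_embs:
        raise ValueError("at least one video embedding is required")
    if len(video_embs) != len(weights):
        raise ValueError("video_embs and weights must have the same length")

    dim = len(video_embs[0])
    pooled = [0.0] * dim
    for emb, w in zip(video_embs, weights):
        unit = _normalize(emb)
        for k in range(dim):
            pooled[k] += w * unit[k]
    return _normalize(pooled)


def score_candidates(
    user_vec: Sequence[float],
    candidate_embs: Sequence[Sequence[float]],
) -> List[float]:
    """Score each candidate by its dot product with the user vector,
    after unit-normalizing the candidate embedding."""
    return [
        sum(u * c for u, c in zip(user_vec, _normalize(emb)))
        for emb in candidate_embs
    ]


if __name__ == "__main__":
    # Toy 2-D embeddings; the first video has the larger preference weight.
    history = [[1.0, 0.0], [0.0, 1.0]]
    prefs = [3.0, 1.0]
    user = dream_user_vector(history, prefs)
    print(score_candidates(user, [[1.0, 0.0], [0.0, 1.0]]))
```

Because the user vector is a single weighted sum followed by a normalization, it can be recomputed whenever a new interaction arrives, which is one reason a closed-form construction suits real-time use.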