On online video platforms, reading or writing comments on interesting videos has become an essential part of the video watching experience. However, existing video recommender systems mainly model users' interactions with videos and do not account for comments when modeling user behavior. In this paper, we propose a novel recommendation approach called LSVCR that leverages users' interaction histories with both videos and comments to jointly conduct personalized video and comment recommendation. Specifically, our approach consists of two key components: a sequential recommendation (SR) model and a supplemental large language model (LLM) recommender. The SR model serves as the primary recommendation backbone of our approach (retained in deployment), allowing for efficient user preference modeling. Meanwhile, we leverage the LLM recommender as a supplemental component (discarded in deployment) to better capture underlying user preferences from heterogeneous interaction behaviors. To integrate the merits of the SR model and the supplemental LLM recommender, we design a two-stage training paradigm. The first stage is personalized preference alignment, which aligns the preference representations from the two components, thereby enhancing the semantics of the SR model. The second stage is recommendation-oriented fine-tuning, in which the alignment-enhanced SR model is fine-tuned according to specific objectives. Extensive experiments on both video and comment recommendation tasks demonstrate the effectiveness of LSVCR. Additionally, online A/B testing on the KuaiShou platform verifies the practical benefits of our approach. In particular, we achieve a significant overall gain of 4.13% in comment watch time.
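To make the first training stage concrete, the following is a minimal sketch of one way to align the preference representations of the two components. The abstract does not specify the alignment objective, so the in-batch contrastive (InfoNCE-style) loss, the function name `alignment_loss`, and the `temperature` parameter are all illustrative assumptions rather than the LSVCR formulation.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(dot(v, v)) or 1.0
    return [a / norm for a in v]


def alignment_loss(sr_reprs, llm_reprs, temperature=0.1):
    """Assumed in-batch contrastive loss, not the paper's stated objective.

    Row i of sr_reprs (SR model) and row i of llm_reprs (LLM recommender)
    are treated as two views of the same user's preference; all other rows
    in the batch act as negatives.
    """
    sr = [normalize(v) for v in sr_reprs]
    llm = [normalize(v) for v in llm_reprs]
    total = 0.0
    for i, s in enumerate(sr):
        logits = [dot(s, l) / temperature for l in llm]
        # Numerically stable log-sum-exp over the batch.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_denominator - logits[i]
    return total / len(sr)


# Hypothetical toy batch: two users with 2-dimensional representations.
sr_batch = [[1.0, 0.0], [0.0, 1.0]]
matched = alignment_loss(sr_batch, [[1.0, 0.0], [0.0, 1.0]])
mismatched = alignment_loss(sr_batch, [[0.0, 1.0], [1.0, 0.0]])
```

Under this assumed objective, the loss is small when each user's SR representation is closest to that same user's LLM representation and large when the pairing is wrong. Because only the SR model is retained in deployment, a loss of this kind would be used to update the SR side before the second, recommendation-oriented fine-tuning stage.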