Collaborative filtering and graph-based recommendation models are highly effective because they leverage observed user interactions, but this dependence creates a fundamental cold-start challenge when newly added content has no interaction history. In Tubi's production retrieval system, this challenge is further constrained by the serving interface: new content must be assigned a standalone embedding immediately, and the model must also produce device embeddings suitable for approximate nearest-neighbor retrieval. We address this setting by formulating cold-start recommendation as an inductive graph-completion problem on a temporal bipartite device-content graph. We propose Shallow-RHS, an asymmetric link-prediction architecture in which the left-hand side (LHS) device tower leverages temporally valid watch-history message passing to capture collaborative signals, while the right-hand side (RHS) content tower is intentionally shallow with respect to the graph and encodes content solely from intrinsic features. The RHS tower does not use ID-based embeddings, content-side subgraphs, neighbor aggregation, or interaction-derived representations, forcing the content encoder to map intrinsic features into a collaborative-filtering-aware embedding space. After training, the learned content encoder generates embeddings for both warm and newly ingested content, enabling implicit graph completion through retrieval of warm surrogate neighbors. We further extend the same representation-completion principle to device cold-start by constructing cohort-based embeddings from demographic features. Large-scale online experiments demonstrate consistent relative improvements in content cold-start engagement, promotion speed, impression acquisition, and device cold-start engagement.
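The asymmetry described above can be illustrated with a minimal sketch. It is not Tubi's implementation, and every name in it (`encode_content`, `encode_device`, `nearest_warm`, `WEIGHTS`, the three-dimensional genre-like features) is a hypothetical stand-in. It assumes a single untrained linear map for the content tower, and it replaces the device tower's message passing with a simple mean over content embeddings watched before a cutoff time. The point is only that the content side sees intrinsic features and nothing else, so a brand-new item gets an embedding immediately and lands near warm items with similar features.

```python
import math


def dot(a, b):
    """Inner product, used both for scoring and for normalisation."""
    return sum(x * y for x, y in zip(a, b))


def encode_content(features, weights):
    """Shallow RHS tower (assumed form): one linear map over intrinsic
    features, then L2 normalisation. No content ID, neighbour list, or
    interaction history is passed in."""
    projected = [dot(row, features) for row in weights]
    norm = math.sqrt(dot(projected, projected)) or 1.0
    return [v / norm for v in projected]


def encode_device(history, content_emb, as_of):
    """LHS tower stand-in (an assumption, not the paper's architecture):
    mean of the embeddings of content watched strictly before `as_of`,
    so no future interaction leaks into the device representation."""
    valid = [content_emb[c] for c, t in history if t < as_of and c in content_emb]
    if not valid:
        return None  # device has no temporally valid history
    return [sum(col) / len(valid) for col in zip(*valid)]


def nearest_warm(query, warm_emb, k=1):
    """Brute-force stand-in for approximate nearest-neighbour retrieval."""
    return sorted(warm_emb, key=lambda c: dot(query, warm_emb[c]), reverse=True)[:k]


# Hypothetical untrained weights (identity) and toy intrinsic features.
WEIGHTS = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
warm_features = {"warm_comedy": [1.0, 0.0, 0.0], "warm_drama": [0.0, 1.0, 0.0]}
warm_emb = {c: encode_content(f, WEIGHTS) for c, f in warm_features.items()}

# A newly ingested item has features but no interactions; it is still encodable.
new_emb = encode_content([0.9, 0.1, 0.0], WEIGHTS)
surrogates = nearest_warm(new_emb, warm_emb, k=1)

# Only the watch at t=10 is valid for a device embedding computed as of t=20.
device_emb = encode_device(
    [("warm_comedy", 10), ("warm_drama", 50)], warm_emb, as_of=20
)
```

In this toy setup the new item's nearest warm surrogate is the warm item with the closest features, and a device whose valid history contains that warm item scores the new item higher than unrelated content. In a trained system the weights would be learned from the link-prediction objective rather than fixed by hand.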