Personalized Audiobook Recommendations at Spotify Through Graph Neural Networks

Marco De Nadai, Francesco Fabbri, Paul Gigioli, Alice Wang, Ang Li, Fabrizio Silvestri, Laura Kim, Shawn Lin, Vladan Radosavljevic, Sandeep Ghael, David Nyhan, Hugues Bouchard, Mounia Lalmas-Roelleke, Andreas Damianou

From arXiv. To appear in The Web Conference 2024 proceedings.

In the ever-evolving digital audio landscape, Spotify, well known for its music and talk content, has recently introduced audiobooks to its vast user base. While promising, this move presents significant challenges for personalized recommendations. Unlike music and podcasts, audiobooks, initially available for a fee, cannot be easily skimmed before purchase, which raises the stakes for the relevance of recommendations. Furthermore, introducing a new content type into an existing platform creates extreme data sparsity, as most users are unfamiliar with it. Lastly, recommending content to millions of users requires a model that responds quickly and scales. To address these challenges, we leverage podcast and music user preferences and introduce 2T-HGNN, a scalable recommendation system comprising Heterogeneous Graph Neural Networks (HGNNs) and a Two Tower (2T) model. This novel approach uncovers nuanced item relationships while ensuring low latency and complexity. We decouple users from the HGNN graph and propose an innovative multi-link neighbor sampler. These choices, together with the 2T component, significantly reduce the complexity of the HGNN model. Empirical evaluations involving millions of users show significant improvement in the quality of personalized recommendations, resulting in a 46% increase in the new audiobook start rate and a 23% boost in streaming rates. Intriguingly, our model's impact extends beyond audiobooks, benefiting established products like podcasts.

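To make the two-tower idea named in the abstract concrete, the following is a minimal sketch of the generic pattern, not the authors' implementation. It assumes that item representations produced by a graph model are available as fixed vectors (here replaced by random stand-ins), that each tower is a small MLP, and that relevance is a dot product between normalized user and item vectors. The abstract specifies none of these details; all names (`Tower`, `item_graph_emb`, `user_features`) and dimensions are hypothetical.

```python
import numpy as np


def l2_normalize(x):
    """Scale each row to unit length (guarding against division by zero)."""
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norm, 1e-12)


class Tower:
    """One-hidden-layer MLP with random, untrained weights (illustration only)."""

    def __init__(self, in_dim, hidden_dim, out_dim, rng):
        self.w1 = rng.normal(0.0, 0.1, size=(in_dim, hidden_dim))
        self.w2 = rng.normal(0.0, 0.1, size=(hidden_dim, out_dim))

    def __call__(self, x):
        hidden = np.maximum(x @ self.w1, 0.0)  # ReLU activation
        return l2_normalize(hidden @ self.w2)


rng = np.random.default_rng(0)
n_items, n_users = 5, 3

# Assumption: random stand-ins for item embeddings from a graph model
# and for per-user feature vectors.
item_graph_emb = rng.normal(size=(n_items, 16))
user_features = rng.normal(size=(n_users, 8))

item_tower = Tower(16, 32, 4, rng)
user_tower = Tower(8, 32, 4, rng)

# Item vectors do not depend on the user, so they can be computed offline.
item_vecs = item_tower(item_graph_emb)
user_vecs = user_tower(user_features)

# Dot product of unit vectors = cosine similarity, one row per user.
scores = user_vecs @ item_vecs.T

# Indices of the two highest-scoring items for each user.
top_k = np.argsort(-scores, axis=1)[:, :2]
```

Because the two towers only meet at the final dot product, serving a request in this pattern reduces to one user-tower forward pass plus a nearest-neighbor lookup over precomputed item vectors, which is one common way such systems keep latency low.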