Predicting Click-Through Rate (CTR) in billion-scale recommender systems is a long-standing challenge for Graph Neural Networks (GNNs) because aggregating billions of neighbors is computationally prohibitive. To cope with this, GNN-based CTR models usually sample hundreds of neighbors out of the billions so that online recommendation stays efficient. However, sampling such a small fraction of neighbors introduces severe sampling bias and fails to capture the full spectrum of user or item behavioral patterns. To address this challenge, we refer to the conventional user-item recommendation graph as the "micro recommendation graph" and introduce a more suitable MAcro Recommendation Graph (MAG) for billion-scale recommendations. MAG resolves the computational complexity problem at the infrastructure level by reducing the node count from billions to hundreds. Specifically, MAG groups micro nodes (users and items) with similar behavior patterns into macro nodes. We then introduce tailored Macro Graph Neural Networks (MacGNN) that aggregate information at the macro level and update the embeddings of macro nodes. MacGNN has already served Taobao's homepage feed for two months, providing recommendations for over one billion users. Extensive offline experiments on three public benchmark datasets and one industrial dataset show that MacGNN significantly outperforms twelve CTR baselines while remaining computationally efficient. In addition, online A/B tests confirm MacGNN's superiority in billion-scale recommender systems.
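To make the macro-node idea concrete, the following is a minimal sketch and not the MacGNN implementation. It assumes a precomputed, hypothetical mapping `item_to_macro` from items to macro-node ids (how MAG actually forms the groups is not specified in this abstract), collapses a user's clicked items into weighted macro neighbors, and aggregates macro-node embeddings with a softmax over log-scaled counts. The function names, the `temperature` parameter, and the weighting scheme are illustrative assumptions.

```python
import math
from collections import Counter


def build_macro_neighbors(clicked_items, item_to_macro):
    """Collapse micro neighbors (items) into macro neighbors with counts.

    `item_to_macro` is a hypothetical, precomputed grouping of items.
    """
    return Counter(item_to_macro[item] for item in clicked_items)


def aggregate_macro(macro_counts, macro_emb, temperature=1.0):
    """Weighted average of macro-node embeddings.

    Assumed weighting: softmax(log(1 + count) / temperature).
    """
    if not macro_counts:
        raise ValueError("no macro neighbors to aggregate")
    nodes = sorted(macro_counts)
    logits = [math.log1p(macro_counts[m]) / temperature for m in nodes]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    dim = len(macro_emb[nodes[0]])
    out = [0.0] * dim
    for m, e in zip(nodes, exps):
        weight = e / total
        for d in range(dim):
            out[d] += weight * macro_emb[m][d]
    return out


# Toy data: three items grouped into two macro nodes.
item_to_macro = {"i1": 0, "i2": 0, "i3": 1}
macro_emb = {0: [1.0, 0.0], 1: [0.0, 1.0]}
counts = build_macro_neighbors(["i1", "i2", "i3", "i1"], item_to_macro)
user_vec = aggregate_macro(counts, macro_emb)
```

In this sketch the cost of aggregation depends on the number of macro nodes rather than on the number of clicked items, which is the property the abstract attributes to MAG.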