How Graph Convolutions Amplify Popularity Bias for Recommendation?

Graph convolutional networks (GCNs) have become prevalent in recommender systems (RS) due to their superiority in modeling collaborative patterns. Although they improve overall accuracy, GCNs unfortunately amplify popularity bias -- tail items are less likely to be recommended. This effect prevents GCN-based RS from making precise and fair recommendations, decreasing the effectiveness of recommender systems in the long run. In this paper, we investigate how graph convolutions amplify popularity bias in RS. Through theoretical analyses, we identify two fundamental factors: (1) with graph convolution (\textit{i.e.,} neighborhood aggregation), popular items exert a larger influence than tail items on neighboring users, moving those users towards popular items in the representation space; (2) after multiple rounds of graph convolution, popular items affect more high-order neighbors and become more influential. Together, these two factors bring popular items closer to almost all users, so they are recommended more frequently. To rectify this, we propose to estimate the amplified effect of popular nodes on each node's representation and to intervene on that effect after each graph convolution. Specifically, we adopt clustering to discover highly influential nodes and estimate the amplification effect on each node, then remove the effect from the node embeddings at each graph convolution layer. Our method is simple and generic -- it can be used at the inference stage to correct existing models rather than training a new model from scratch, and it can be applied to various GCN models. We demonstrate our method on two representative GCN backbones, LightGCN and UltraGCN, verifying its ability to improve the recommendation of tail items without sacrificing performance on popular items. The code is open-sourced\footnote{https://github.com/MEICRS/DAP}.
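To make the inference-stage intervention more concrete, the following is a minimal sketch and not the released DAP implementation. It assumes LightGCN-style propagation with a symmetric-normalized adjacency matrix, a small hand-written k-means for the clustering step, and a hypothetical estimator in which a node's "amplified effect" is the degree-weighted mean embedding of the higher-degree nodes in its cluster. The function names, the strength parameter `alpha`, and the number of clusters `k` are all illustrative assumptions; the abstract does not specify the actual estimator.

```python
import numpy as np

def normalized_adjacency(interactions):
    """Symmetric-normalized adjacency D^{-1/2} A D^{-1/2} of the user-item graph."""
    n_users, n_items = interactions.shape
    n = n_users + n_items
    adj = np.zeros((n, n))
    adj[:n_users, n_users:] = interactions
    adj[n_users:, :n_users] = interactions.T
    deg = adj.sum(axis=1)
    inv_sqrt = np.zeros_like(deg)
    inv_sqrt[deg > 0] = deg[deg > 0] ** -0.5
    return inv_sqrt[:, None] * adj * inv_sqrt[None, :], deg

def kmeans_labels(x, k, n_iter=20, seed=0):
    """Tiny k-means, used here only as a stand-in clustering step."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(n_iter):
        dists = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for c in range(k):
            members = x[labels == c]
            if len(members) > 0:
                centers[c] = members.mean(axis=0)
    return labels

def estimate_popularity_effect(emb, deg, labels):
    """Hypothetical estimator (an assumption, not the paper's formula):
    degree-weighted mean of higher-degree nodes in the same cluster."""
    effect = np.zeros_like(emb)
    for v in range(len(emb)):
        mask = (labels == labels[v]) & (deg > deg[v])
        if mask.any():
            weights = deg[mask] / deg[mask].sum()
            effect[v] = weights @ emb[mask]
    return effect

def debiased_propagation(emb0, interactions, n_layers=2, alpha=0.5, k=2):
    """Propagate trained embeddings; subtract the estimated effect after each layer."""
    a_hat, deg = normalized_adjacency(interactions)
    layers = [emb0]
    emb = emb0
    for _ in range(n_layers):
        emb = a_hat @ emb  # neighborhood aggregation
        if alpha > 0:
            labels = kmeans_labels(emb, k)
            emb = emb - alpha * estimate_popularity_effect(emb, deg, labels)
        layers.append(emb)
    return np.mean(layers, axis=0)  # LightGCN-style mean over layers

# Toy data: 4 users, 3 items; item 0 is popular (every user interacted with it).
interactions = np.array([[1, 1, 0], [1, 0, 1], [1, 0, 0], [1, 0, 0]], dtype=float)
emb0 = np.random.default_rng(0).normal(size=(7, 4))  # stand-in for trained embeddings
plain = debiased_propagation(emb0, interactions, alpha=0.0)
debiased = debiased_propagation(emb0, interactions, alpha=0.5)
```

Setting `alpha=0.0` recovers plain propagation, which reflects the post-hoc nature of the approach: the trained embeddings are left untouched and only the aggregation at inference time is adjusted.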
