In this paper, we investigate the conditions under which link analysis algorithms prevent minority groups from reaching high-ranking slots. We find that the most common link-based algorithms using centrality metrics, such as PageRank and HITS, can reproduce and even amplify bias against minority groups in networks. Yet their behavior differs. On the one hand, we show empirically that PageRank mirrors the degree distribution for most ranking positions and can equalize the representation of minorities among the top-ranked nodes. On the other hand, through a novel theoretical analysis supported by empirical results, we show that HITS amplifies pre-existing bias in homophilic networks. Using an evolving network model with two communities, we identify the level of homophily in the network as the root cause of bias amplification in HITS. We illustrate our theoretical analysis on both synthetic and real datasets and present directions for future work.