Despite the remarkable success of graph neural networks (GNNs) in modeling graph-structured data, GNNs, like other machine learning models, are susceptible to making biased predictions based on sensitive attributes such as race and gender. To promote fairness, recent state-of-the-art (SOTA) methods propose filtering sensitive information out of inputs or representations, e.g., by edge dropping or feature masking. However, we argue that such filtering-based strategies may also remove non-sensitive feature information, leading to a sub-optimal trade-off between predictive performance and fairness. To address this issue, we propose a novel neutralization-based paradigm, in which additional Fairness-facilitating Features (F3) are incorporated into node features or representations before message passing. The F3 are expected to statistically neutralize the sensitive bias in node representations while providing additional non-sensitive information. We also give a theoretical justification for this rationale, concluding that F3 can be realized by emphasizing the features of each node's heterogeneous neighbors (neighbors with different sensitive attributes). We name our method FairSIN and present three implementation variants from both data-centric and model-centric perspectives. Experimental results on five benchmark datasets with three different GNN backbones show that FairSIN significantly improves fairness metrics while maintaining high prediction accuracy.
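To make the neutralization idea concrete, the following is a minimal NumPy sketch of a data-centric variant as described above: each node's features are augmented with a scaled average of its heterogeneous neighbors' features before being fed to a standard GNN backbone. The function name `neutralize_features`, the hyperparameter `delta`, and the choice of mean aggregation are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def neutralize_features(X, A, s, delta=1.0):
    """Sketch of data-centric F3 neutralization (illustrative, not the
    paper's exact method): for every node, average the features of its
    heterogeneous neighbors (neighbors whose sensitive attribute differs),
    scale by delta, and add the result to the node's own features.

    X:     (n, d) node feature matrix
    A:     (n, n) binary adjacency matrix
    s:     (n,)   binary sensitive attribute vector
    delta: strength of the neutralizing signal (assumed hyperparameter)
    """
    n = X.shape[0]
    X_new = X.copy()
    for v in range(n):
        # Heterogeneous neighbors: adjacent nodes with a different sensitive value.
        hetero = np.where((A[v] > 0) & (s != s[v]))[0]
        if len(hetero) > 0:
            # Emphasize heterogeneous neighbors' features to statistically
            # counterbalance the sensitive bias in node v's representation.
            X_new[v] += delta * X[hetero].mean(axis=0)
    return X_new  # feed X_new to any standard GNN backbone
```

Under this sketch, `delta` controls the fairness/accuracy trade-off: larger values push node representations toward the other sensitive group's feature distribution, while `delta = 0` recovers the original features.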