Promoting Fairness in GNNs: A Characterization of Stability

The Lipschitz bound, a technique from robust statistics, limits the maximum change in a model's output with respect to changes in its input, including changes caused by irrelevant biased factors. It is an efficient and provable method for examining the output stability of machine learning models without incurring additional computational cost. Recently, Graph Neural Networks (GNNs), which operate on non-Euclidean data, have gained significant attention. However, no previous research has investigated Lipschitz bounds for GNNs to shed light on stabilizing model outputs, especially when working with non-Euclidean data that carries inherent biases. Because the graph data commonly used for GNN training is inherently biased, constraining the GNN output perturbations induced by input biases, and thereby safeguarding fairness during training, is a serious challenge. Although the Lipschitz constant has recently been used to control the stability of Euclidean neural networks, computing the precise Lipschitz constant remains elusive for non-Euclidean neural networks such as GNNs, especially in fairness contexts. To narrow this gap, we begin with general GNNs operating on an attributed graph and formulate a Lipschitz bound that limits the changes in the output with respect to biases associated with the input. Additionally, we theoretically analyze how the Lipschitz constant of a GNN model can constrain the output perturbations induced by biases learned from data for fairness training. We experimentally validate the Lipschitz bound's effectiveness in limiting biases in the model output. Finally, from a training dynamics perspective, we demonstrate why the theoretical Lipschitz bound can effectively guide GNN training toward a better trade-off between accuracy and fairness.
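To make the notion of a Lipschitz bound concrete, the following is a minimal sketch and not the bound derived in this work. It assumes a single GCN-style layer, ReLU(Â X W), and uses the standard product of spectral norms as an upper bound on the layer's Lipschitz constant in the Frobenius norm. The function names, the toy graph, and the random perturbation standing in for an input bias are all hypothetical choices made for illustration.

```python
import numpy as np

def normalized_adjacency(adj):
    """Symmetrically normalize A + I (self-loops added), GCN style."""
    a_hat = adj + np.eye(adj.shape[0])
    deg_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * deg_inv_sqrt[:, None] * deg_inv_sqrt[None, :]

def gcn_layer(a_norm, x, w):
    """One graph convolution followed by ReLU, which is 1-Lipschitz."""
    return np.maximum(a_norm @ x @ w, 0.0)

def layer_lipschitz_upper_bound(a_norm, w):
    """Product of spectral norms: an upper bound, not the exact constant."""
    return np.linalg.norm(a_norm, 2) * np.linalg.norm(w, 2)

rng = np.random.default_rng(0)

# Hypothetical 4-node undirected graph with 3 input features per node.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 1],
                [0, 1, 0, 1],
                [0, 1, 1, 0]], dtype=float)
a_norm = normalized_adjacency(adj)
x = rng.normal(size=(4, 3))
w = rng.normal(size=(3, 2))

# A small random shift of the node features, standing in for an input bias.
bias_shift = 0.1 * rng.normal(size=(4, 3))

bound = layer_lipschitz_upper_bound(a_norm, w)
input_change = np.linalg.norm(bias_shift)  # Frobenius norm
output_change = np.linalg.norm(
    gcn_layer(a_norm, x + bias_shift, w) - gcn_layer(a_norm, x, w)
)

# The output perturbation never exceeds bound * input perturbation.
print(output_change, bound * input_change)
```

Under these assumptions, shrinking the spectral norm of the weights tightens the guarantee on how far a biased input can move the output, which is the intuition behind using a Lipschitz bound to guide training.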