Graph Neural Networks (GNNs) are well suited to learning on homophilous graphs, i.e., graphs in which edges tend to connect nodes of the same type. Achieving consistent GNN performance on heterophilous graphs, however, remains an open research problem. Recent works have proposed extensions to standard GNN architectures that improve performance on heterophilous graphs, trading model simplicity for prediction accuracy. Yet these models fail to capture basic graph properties, such as the neighborhood label distribution, that are fundamental for learning. In this work, we propose GCN for Heterophily (GCNH), a simple yet effective GNN architecture applicable to both heterophilous and homophilous scenarios. GCNH learns and combines separate representations for a node and its neighbors, using one learned importance coefficient per layer to balance the contributions of center nodes and neighborhoods. We conduct extensive experiments on eight real-world graphs and a set of synthetic graphs with varying degrees of heterophily to demonstrate how the design choices for GCNH lead to a sizable improvement over a vanilla GCN. Moreover, GCNH outperforms state-of-the-art models of much higher complexity on four out of eight benchmarks, while producing comparable results on the remaining datasets. Finally, we analyze the lower complexity of GCNH, which results in fewer trainable parameters and faster training times than other methods, and show how GCNH mitigates the oversmoothing problem.
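To make the notion of homophily concrete, the sketch below computes the edge homophily ratio, i.e., the fraction of edges whose endpoints share a label. This is one common way to quantify homophily and is not necessarily the measure used in this work; the function name `edge_homophily` and the toy graph are illustrative assumptions.

```python
def edge_homophily(edges, labels):
    """Fraction of edges whose two endpoints carry the same label.

    `edges` is a list of (u, v) pairs and `labels` maps node -> label.
    Both the name and the input layout are illustrative assumptions.
    """
    if not edges:
        raise ValueError("graph has no edges")
    same = sum(1 for u, v in edges if labels[u] == labels[v])
    return same / len(edges)


# Toy graph with made-up labels: two intra-class and two inter-class edges.
labels = {0: "a", 1: "a", 2: "b", 3: "b"}
edges = [(0, 1), (2, 3), (0, 2), (1, 3)]
ratio = edge_homophily(edges, labels)
```

A ratio close to 1 indicates a homophilous graph, while a ratio close to 0 indicates a heterophilous one.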
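The idea of combining separate node and neighborhood representations with a single per-layer coefficient can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the aggregation function, the nonlinearity, or how the coefficient is parameterized, so mean aggregation, ReLU, a sigmoid-squashed scalar `beta_logit`, and the name `gcnh_style_layer` are all hypothetical choices.

```python
import math


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xj for w, xj in zip(row, x)) for row in W]


def relu(v):
    return [max(0.0, a) for a in v]


def mean_vectors(vectors, dim):
    """Element-wise mean; returns zeros for a node with no neighbors."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[k] for v in vectors) / len(vectors) for k in range(dim)]


def gcnh_style_layer(features, neighbors, W_self, W_neigh, beta_logit):
    """One layer that keeps node and neighborhood representations separate.

    Assumptions (not taken from the source): mean aggregation, ReLU,
    and a single scalar per layer squashed into (0, 1) by a sigmoid.
    """
    beta = 1.0 / (1.0 + math.exp(-beta_logit))  # one coefficient per layer
    dim = len(features[0])
    out = []
    for i, x in enumerate(features):
        h_self = relu(matvec(W_self, x))  # center-node representation
        agg = mean_vectors([features[j] for j in neighbors[i]], dim)
        h_neigh = relu(matvec(W_neigh, agg))  # neighborhood representation
        out.append([(1.0 - beta) * s + beta * n
                    for s, n in zip(h_self, h_neigh)])
    return out


# Toy input: 3 nodes, 2 features, identity weights, equal weighting.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
neighbors = [[1, 2], [0], [0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
hidden = gcnh_style_layer(features, neighbors, identity, identity, 0.0)
```

In this sketch, pushing `beta_logit` toward large negative values makes the layer rely almost entirely on the center node, which is one way such a coefficient could let a model discount uninformative neighborhoods in heterophilous graphs.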