Graph neural networks (GNNs) have proven powerful in graph-oriented tasks. However, many real-world graphs are heterophilous, which challenges the homophily assumption of classical GNNs. To address this universality problem, many studies deepen networks or concatenate intermediate representations, but these strategies do not fundamentally change neighbor aggregation and they introduce noise. Recent studies propose new metrics to characterize homophily, but rarely consider how the proposed metrics relate to the models. In this paper, we first design a new metric, Neighborhood Homophily (\textit{NH}), to measure the label complexity or purity in node neighborhoods. We then incorporate the metric into the classical graph convolutional network (GCN) architecture and propose the \textbf{N}eighborhood \textbf{H}omophily-based \textbf{G}raph \textbf{C}onvolutional \textbf{N}etwork (\textbf{NHGCN}). In this framework, neighbors are grouped by their estimated \textit{NH} values and aggregated through different channels, and the resulting node predictions are used in turn to estimate and update the \textit{NH} values. The two processes, metric estimation and model inference, are optimized alternately to achieve better node classification. NHGCN achieves top overall performance on both homophilous and heterophilous benchmarks, with an improvement of up to 7.4\% over current state-of-the-art (SOTA) methods.
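
To make the idea of neighborhood label purity and NH-based neighbor grouping concrete, the following is a minimal sketch and not the NHGCN implementation. The abstract does not give the formula for \textit{NH}, so the sketch assumes a simple stand-in: the share of the most common label among a node's direct neighbors. The function names, the adjacency-dictionary format, and the threshold of 0.7 are all hypothetical choices made for illustration.

```python
from collections import Counter


def neighborhood_purity(adj, labels):
    """Assumed stand-in for NH: share of the most common label
    among each node's direct neighbors (1.0 = perfectly pure)."""
    scores = {}
    for node, neighbors in adj.items():
        if not neighbors:
            scores[node] = 0.0  # assumption: isolated nodes score 0
            continue
        counts = Counter(labels[n] for n in neighbors)
        scores[node] = counts.most_common(1)[0][1] / len(neighbors)
    return scores


def group_neighbors(adj, scores, threshold=0.7):
    """Split each node's neighbors into (high, low) groups by score.
    The threshold is a hypothetical hyperparameter."""
    groups = {}
    for node, neighbors in adj.items():
        high = [n for n in neighbors if scores[n] >= threshold]
        low = [n for n in neighbors if scores[n] < threshold]
        groups[node] = (high, low)
    return groups


# Toy graph: undirected edges 0-1, 0-2, 0-3, 1-2.
adj = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1], 3: [0]}
labels = {0: "a", 1: "a", 2: "b", 3: "a"}

scores = neighborhood_purity(adj, labels)
groups = group_neighbors(adj, scores)
print(scores)
print(groups)
```

In a full model, each group would feed a separate aggregation channel, and the labels used to compute the scores would be replaced by the model's own predictions for unlabeled nodes, so that the scores and the classifier can be refined alternately as described above.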