Under heterophily, where nodes with different labels tend to be connected based on semantic meaning, Graph Neural Networks (GNNs) often perform suboptimally. Current studies on graph heterophily mainly focus on aggregation calibration or neighbor extension, addressing the issue by using node features or structural information to improve GNN representations. In this paper, we propose and demonstrate that the valuable semantic information inherent in heterophily can be used effectively in graph learning by investigating the distribution of neighbors of each individual node in the graph. We carry out a theoretical analysis to demonstrate the efficacy of this idea in enhancing graph learning. Based on this analysis, we propose HiGNN, an approach that constructs an additional graph structure integrating heterophilous information: it leverages node distribution to enhance connectivity between nodes that share similar semantic characteristics. We conduct empirical assessments on node classification tasks using both homophilous and heterophilous benchmark datasets and compare HiGNN to popular GNN baselines and state-of-the-art (SoTA) methods, confirming its effectiveness in improving graph representations. In addition, we show that incorporating heterophilous information notably improves existing GNN-based approaches and increases the homophily degree of real-world datasets, further affirming the efficacy of our approach.
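To make the idea of connecting nodes through their neighbor distributions concrete, the following is a minimal sketch and not the HiGNN implementation. It assumes that a node's "neighbor distribution" is the empirical label distribution of its neighbors, that labels are available for all nodes (in practice, predicted labels might stand in for unknown ones), and that similarity is measured with cosine similarity and a k-nearest-neighbor rule. The function names `neighbor_label_distribution`, `similarity_graph`, and `edge_homophily` are hypothetical and chosen for illustration only.

```python
import numpy as np


def neighbor_label_distribution(adj, labels, num_classes):
    """Row i is the empirical label distribution over node i's neighbors."""
    one_hot = np.eye(num_classes)[labels]        # (n, c) one-hot labels
    counts = adj @ one_hot                       # (n, c) neighbor label counts
    degree = counts.sum(axis=1, keepdims=True)   # (n, 1) number of neighbors
    # Isolated nodes keep an all-zero row instead of dividing by zero.
    return np.divide(counts, degree, out=np.zeros_like(counts), where=degree > 0)


def similarity_graph(dist, k):
    """Link each node to its k most similar nodes (cosine similarity), symmetrized."""
    norms = np.linalg.norm(dist, axis=1, keepdims=True)
    unit = np.divide(dist, norms, out=np.zeros_like(dist), where=norms > 0)
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)               # exclude self-matches
    top_k = np.argsort(-sim, axis=1)[:, :k]      # indices of the k best matches
    new_adj = np.zeros(sim.shape)
    rows = np.repeat(np.arange(len(dist)), k)
    new_adj[rows, top_k.ravel()] = 1.0
    return np.maximum(new_adj, new_adj.T)        # make the graph undirected


def edge_homophily(adj, labels):
    """Fraction of edges whose endpoints share a label."""
    src, dst = np.nonzero(adj)
    return float(np.mean(labels[src] == labels[dst]))


# Toy fully heterophilous graph: every edge joins a class-0 and a class-1 node.
labels = np.array([0, 0, 0, 1, 1, 1])
adj = np.zeros((6, 6))
adj[:3, 3:] = 1.0
adj[3:, :3] = 1.0

dist = neighbor_label_distribution(adj, labels, num_classes=2)
new_adj = similarity_graph(dist, k=2)

print(edge_homophily(adj, labels))      # 0.0: no original edge joins same-label nodes
print(edge_homophily(new_adj, labels))  # 1.0: every new edge joins same-label nodes
```

In this toy case, nodes of the same class have identical neighbor distributions even though they are never directly connected, so the constructed graph links them and its edge homophily rises from 0 to 1. Real datasets would give noisier distributions, and the choice of similarity measure and of `k` here is an illustrative assumption.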