The Heterophilic Snowflake Hypothesis: Training and Empowering GNNs for Heterophilic Graphs

Graph Neural Networks (GNNs) have become pivotal tools for a range of graph-based learning tasks. Most current GNN architectures assume homophily, whether explicitly or implicitly. This assumption is widely adopted but does not hold universally, and learning effectiveness can suffer on graphs where it fails. In this paper, \textbf{for the first time}, we transfer the prevailing concept of ``one node, one receptive field'' to heterophilic graphs. By constructing a proxy label predictor, we give each node a latent prediction distribution, which helps connected nodes decide whether to aggregate their neighbors. As a result, every node can have its own aggregation hop and pattern, much as every snowflake is unique. Based on these observations, we introduce the Heterophilic Snowflake Hypothesis and provide an effective solution to guide and facilitate research on heterophilic graphs and beyond. We conduct comprehensive experiments covering (1) main results on 10 graphs with varying heterophily ratios across 10 backbones; (2) scalability on deep GNN backbones (SGC, JKNet, etc.) at a range of depths (2, 4, 6, 8, 16, and 32 layers); (3) comparison with the conventional snowflake hypothesis; and (4) efficiency comparison with existing graph pruning algorithms. Our observations show that the framework acts as a versatile operator for diverse tasks. It can be integrated into various GNN frameworks, improving performance at depth and offering an explainable approach to choosing the optimal network depth. The source code is available at \url{https://github.com/bingreeky/HeteroSnoH}.
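
The idea of letting a proxy prediction distribution decide whether two connected nodes should aggregate can be illustrated with a small sketch. This is not the authors' implementation: the function names `prune_edges` and `mean_aggregate`, the cosine-similarity criterion, and the `threshold` parameter are all assumptions chosen for illustration, and the proxy distributions are hard-coded rather than produced by a trained predictor.

```python
import numpy as np


def prune_edges(proxy_probs, edges, threshold=0.5):
    """Keep an edge only if its endpoints have similar proxy distributions.

    Assumption: similarity is cosine similarity between the two nodes'
    predicted class distributions; the abstract does not fix a criterion.
    """
    kept = []
    for u, v in edges:
        p, q = proxy_probs[u], proxy_probs[v]
        sim = float(p @ q / (np.linalg.norm(p) * np.linalg.norm(q)))
        if sim >= threshold:
            kept.append((u, v))
    return kept


def mean_aggregate(features, edges):
    """One round of mean aggregation over undirected edges, with self-loops."""
    n = features.shape[0]
    out = features.astype(float)   # astype copies, so self-loop term is in place
    deg = np.ones(n)               # each node counts itself once
    for u, v in edges:
        out[u] += features[v]
        out[v] += features[u]
        deg[u] += 1
        deg[v] += 1
    return out / deg[:, None]


# Hypothetical proxy distributions: nodes 0 and 1 lean to class 0, node 2 to class 1.
proxy_probs = np.array([[0.9, 0.1], [0.8, 0.2], [0.1, 0.9]])
edges = [(0, 1), (1, 2), (0, 2)]
features = np.array([[1.0, 0.0], [3.0, 0.0], [0.0, 5.0]])

kept = prune_edges(proxy_probs, edges)       # only the (0, 1) edge survives
aggregated = mean_aggregate(features, kept)  # node 2 keeps its own features
```

In this toy graph, node 2 ends up with no aggregation partners while nodes 0 and 1 average with each other, which mirrors the claim that each node can end up with its own receptive field. Applying the same filter per layer would give per-node aggregation depths, but that extension is likewise an assumption here.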
