Learning unbiased node representations from imbalanced samples in graphs has become an increasingly important topic. For graphs, a significant challenge is that, beyond imbalance in the number of labeled training nodes (quantity-imbalance), the topological properties of the nodes (e.g., locations, roles) are also imbalanced (topology-imbalance). Existing studies on topology-imbalance focus on the location or the local neighborhood structure of nodes, ignoring the global underlying hierarchical properties of the graph, i.e., its hierarchy. In real-world scenarios, the hierarchical structure of graph data reveals important topological properties of graphs and is relevant to a wide range of applications. We find that labeled training nodes with different hierarchical properties have a significant impact on node classification, and we confirm this in our experiments. Hyperbolic geometry is well known to have a unique advantage in representing the hierarchical structure of graphs. We therefore explore the hierarchy-imbalance issue in node classification with graph neural networks from the novel perspective of hyperbolic geometry, including its characteristics and causes. We then propose a novel hyperbolic geometric hierarchy-imbalance learning framework, named HyperIMBA, to alleviate the hierarchy-imbalance issue caused by uneven hierarchy-levels and cross-hierarchy connectivity patterns of labeled nodes. Extensive experimental results demonstrate the superior effectiveness of HyperIMBA on hierarchy-imbalance node classification tasks.