Graphs represent interconnected structures that arise in many real-world scenarios. Graph analytics, including graph learning methods, lets users extract insights from graph data and underpins tasks such as node classification and link prediction. However, these methods often suffer from data imbalance, a common issue in graph data where some segments have abundant data while others have little, which leads to biased learning outcomes. This motivates the emerging field of imbalanced learning on graphs, which aims to correct these distribution skews for more accurate and representative learning outcomes. In this survey, we present a comprehensive review of the literature on imbalanced learning on graphs. We begin by defining the concept and its related terminology, giving readers a solid foundation. We then propose two taxonomies: (1) the problem taxonomy, which describes the forms of imbalance we consider, the associated tasks, and potential solutions; and (2) the technique taxonomy, which details key strategies for addressing these imbalances and helps readers select a method. Finally, we suggest future directions for both problems and techniques in imbalanced learning on graphs, to foster further innovation in this area.