Graph classification, which aims to learn graph-level representations for effective class assignment, has achieved remarkable results, but it relies heavily on high-quality datasets with balanced class distributions. In practice, most real-world graph data follows a long-tailed distribution, in which head classes contain far more samples than tail classes. Graph-level classification over long-tailed data is therefore essential to study, yet it remains largely unexplored. Most existing long-tailed learning methods in computer vision fail to jointly optimize representation learning and classifier training, and they neglect the mining of hard-to-classify classes. Directly applying these methods to graphs may lead to sub-optimal performance, since models trained on graphs are more sensitive to long-tailed distributions because of their complex topological characteristics. In this paper, we therefore propose a novel long-tailed graph-level classification framework based on Collaborative Multi-expert Learning (CoMe). To balance the contributions of head and tail classes, we first develop balanced contrastive learning from the perspective of representation learning, and then design individual-expert classifier training based on hard class mining. In addition, we apply gated fusion and disentangled knowledge distillation among the multiple experts to promote collaboration within the multi-expert framework. Comprehensive experiments on seven widely used benchmark datasets demonstrate the superiority of CoMe over state-of-the-art baselines.
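The abstract does not specify how the gated fusion is computed. As a rough illustration of the general idea only, the sketch below fuses the class logits of several experts with softmax-normalized gate weights. The function names `softmax` and `gated_fusion`, the fixed `gate_scores`, and the toy logits are all assumptions made for this example and are not taken from the paper; in a trained model the gate scores would normally be produced by a learned module from the graph representation.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def gated_fusion(expert_logits, gate_scores):
    """Fuse per-expert class logits with softmax-normalized gate weights.

    expert_logits: one list of class logits per expert (hypothetical values).
    gate_scores:   one unnormalized score per expert (hypothetical; assumed
                   here to be given rather than learned).
    """
    weights = softmax(gate_scores)
    num_classes = len(expert_logits[0])
    fused = [
        sum(w * logits[c] for w, logits in zip(weights, expert_logits))
        for c in range(num_classes)
    ]
    return fused, weights


# Toy setup: three experts, three classes, equal gate scores.
expert_logits = [
    [2.0, 0.5, -1.0],
    [0.0, 1.5, 0.5],
    [-0.5, 0.0, 2.0],
]
gate_scores = [0.0, 0.0, 0.0]

fused, weights = gated_fusion(expert_logits, gate_scores)
predicted_class = max(range(len(fused)), key=lambda c: fused[c])
```

With equal gate scores each expert contributes the same weight, so the fused logits are simply the per-class average of the experts. Raising one expert's gate score shifts the fused prediction toward that expert's output.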