Graph neural networks (GNNs) have recently demonstrated significant success. Active learning for GNNs aims to query the most valuable samples from unlabeled data for annotation, maximizing GNN performance at low cost. However, most existing reinforced active learning methods for GNNs can produce a highly imbalanced class distribution in the labeled set, especially when the underlying classes are highly skewed, which in turn degrades classification performance. To tackle this issue, we propose GraphCBAL, a novel reinforced class-balanced active learning framework for GNNs. It learns an optimal policy for acquiring class-balanced and informative nodes for annotation, maximizing the performance of GNNs trained on the selected labeled nodes. GraphCBAL designs class-balance-aware states, together with a reward function that achieves a trade-off between model performance and class balance. We further upgrade GraphCBAL to GraphCBAL++ by introducing a punishment mechanism that yields a more class-balanced labeled set. Extensive experiments on multiple datasets demonstrate the effectiveness of the proposed approaches, which achieve superior performance over state-of-the-art baselines. In particular, our methods strike a balance between classification results and class balance.
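To make the trade-off between model performance and class balance concrete, the following is a minimal illustrative sketch, not the reward actually defined for GraphCBAL or GraphCBAL++. It assumes that class balance is measured by the normalized entropy of the labeled class distribution, that the reward is a convex combination weighted by a hypothetical coefficient `lam`, and that the punishment is a fixed hypothetical `penalty` applied when the newly labeled node belongs to a class that is already the most frequent. All function and parameter names are invented for illustration.

```python
import math
from collections import Counter


def class_balance(labels, num_classes):
    """Normalized entropy of the labeled class distribution (assumed balance measure).

    Close to 1 when labels are spread evenly over all classes,
    close to 0 when they concentrate in a single class.
    """
    if not labels or num_classes < 2:
        return 0.0
    counts = Counter(labels)
    total = len(labels)
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return entropy / math.log(num_classes)


def balanced_reward(val_accuracy, labels, num_classes, lam=0.5):
    """Hypothetical reward: convex combination of accuracy and class balance."""
    return (1.0 - lam) * val_accuracy + lam * class_balance(labels, num_classes)


def punished_reward(val_accuracy, labels, new_label, num_classes,
                    lam=0.5, penalty=0.1):
    """Hypothetical punishment variant.

    Computes the reward on the labeled set after adding new_label, then
    subtracts a fixed penalty if new_label was already a most frequent class.
    """
    reward = balanced_reward(val_accuracy, labels + [new_label], num_classes, lam)
    counts = Counter(labels)
    if counts and counts[new_label] == max(counts.values()):
        reward -= penalty
    return reward
```

Under these assumptions, labeling a node from an already dominant class yields a lower reward than labeling one from an under-represented class at the same validation accuracy, which is the behavior a class-balance-aware policy would be trained to prefer.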