Graph neural networks (GNNs) have demonstrated significant success in applications such as node classification, link prediction, and graph classification. Active learning for GNNs aims to query the most valuable samples from the unlabeled data for annotation, maximizing the GNNs' performance at a lower labeling cost. However, most existing reinforced active learning algorithms for GNNs may produce a highly imbalanced class distribution in the labeled set, especially when the underlying class distribution is highly skewed. GNNs trained on class-imbalanced labeled data are susceptible to bias toward majority classes, and the resulting lower performance on minority classes may degrade overall performance. To tackle this issue, we propose GCBR, a novel class-balanced and reinforced active learning framework for GNNs. It learns an optimal policy to acquire class-balanced and informative nodes for annotation, maximizing the performance of GNNs trained with the selected labeled nodes. GCBR designs class-balance-aware states, as well as a reward function that achieves a trade-off between model performance and class balance. The reinforcement learning algorithm Advantage Actor-Critic (A2C) is employed to learn an optimal policy stably and efficiently. We further upgrade GCBR to GCBR++ by introducing a punishment mechanism to obtain a more class-balanced labeled set. Extensive experiments on multiple datasets demonstrate the effectiveness of the proposed approaches, which achieve superior performance over state-of-the-art baselines.
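To make the idea of a reward that trades off model performance against class balance concrete, the following is a minimal sketch and not the reward actually used by GCBR, whose exact form is not given here. It assumes that class balance is measured by the normalized entropy of the labeled set's class distribution and that the two terms are combined as a weighted sum; the names `class_balance`, `reward`, `val_score`, and `trade_off` are all hypothetical.

```python
import math
from collections import Counter


def class_balance(labels, num_classes):
    """Normalized entropy of the labeled set's class distribution.

    Assumption for illustration: 1.0 means perfectly balanced,
    0.0 means every labeled node belongs to a single class.
    """
    if not labels or num_classes < 2:
        return 0.0
    counts = Counter(labels)
    n = len(labels)
    entropy = -sum((c / n) * math.log(c / n) for c in counts.values())
    return entropy / math.log(num_classes)


def reward(val_score, labels, num_classes, trade_off=0.5):
    """Hypothetical reward: weighted sum of performance and class balance.

    val_score: validation metric of the GNN trained on the labeled nodes,
               assumed to lie in [0, 1].
    trade_off: weight on the balance term (0 ignores balance entirely).
    """
    balance = class_balance(labels, num_classes)
    return (1.0 - trade_off) * val_score + trade_off * balance
```

Under these assumptions, two labeled sets that yield the same validation score receive different rewards when one of them is more evenly spread across classes, which is the kind of signal a policy would need in order to prefer class-balanced queries.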