Graph neural networks (GNNs) have demonstrated significant success in applications such as node classification, link prediction, and graph classification. Active learning for GNNs aims to query the most valuable samples from the unlabeled data for annotation, maximizing the GNN's performance at a lower labeling cost. However, most existing reinforced active learning algorithms for GNNs may produce a labeled set with a highly imbalanced class distribution, especially when the underlying classes are highly skewed. GNNs trained on class-imbalanced labeled data are susceptible to bias toward the majority classes, and poor performance on the minority classes can degrade overall performance. To tackle this issue, we propose a novel class-balanced and reinforced active learning framework for GNNs, namely GCBR. It learns an optimal policy for acquiring class-balanced and informative nodes for annotation, maximizing the performance of GNNs trained on the selected labeled nodes. GCBR designs class-balance-aware states, as well as a reward function that achieves a trade-off between model performance and class balance. The reinforcement learning algorithm Advantage Actor-Critic (A2C) is employed to learn an optimal policy stably and efficiently. We further upgrade GCBR to GCBR++ by introducing a punishment mechanism to obtain a more class-balanced labeled set. Extensive experiments on multiple datasets demonstrate the effectiveness of the proposed approaches, which achieve superior performance over state-of-the-art baselines.
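To make the idea of a reward that trades off model performance against class balance concrete, the following is a minimal sketch, not the reward used by GCBR, whose exact form the abstract does not give. It assumes that balance is measured by the normalized entropy of the labeled set's class distribution and that the two terms are combined additively with a weight `lam`. The function names, the entropy-based score, and the default `lam=0.5` are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def class_balance_score(labels, num_classes):
    """Normalized entropy of the class distribution of the labeled set.

    Returns a value in [0, 1]: 1 for a perfectly uniform distribution
    over all classes, 0 when every labeled node has the same class.
    (Illustrative assumption; not the measure defined by GCBR.)
    """
    if not labels or num_classes < 2:
        return 0.0
    counts = Counter(labels)
    total = len(labels)
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    # Clamp to avoid returning -0.0 for a single-class labeled set.
    return max(0.0, entropy / math.log(num_classes))


def balanced_reward(val_accuracy, labels, num_classes, lam=0.5):
    """Hypothetical reward: validation accuracy plus a weighted balance term.

    `lam` controls the trade-off; lam = 0 recovers a performance-only reward.
    """
    return val_accuracy + lam * class_balance_score(labels, num_classes)
```

Under these assumptions, two labeled sets that yield the same validation accuracy receive different rewards when one covers the classes more evenly, which is the kind of signal a policy trained with A2C could exploit to prefer minority-class nodes.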