Node classification is the task of predicting the labels of unlabeled nodes in a graph. State-of-the-art methods based on graph neural networks (GNNs) achieve excellent performance when all labels are available during training. In real-world applications, however, models are often applied to data containing new classes, which can cause widespread misclassification and significantly degrade performance. Hence, it is crucial to develop open-set classification methods that determine whether a given sample belongs to a known class. Existing methods for open-set node classification generally use transductive learning, relying on some or all of the features of real unseen-class nodes to help with open-set classification. In this paper, we propose a novel generative open-set node classification method, $\mathcal{G}^2Pxy$, which follows a stricter inductive learning setting in which no information about unknown classes is available during training or validation. Two kinds of proxy unknown nodes, inter-class unknown proxies and external unknown proxies, are generated via mixup to efficiently anticipate the distribution of novel classes. Using the generated proxies, a closed-set classifier can be transformed into an open-set one by augmenting it with an extra proxy classifier. Trained under both a cross-entropy loss and a complement entropy loss, $\mathcal{G}^2Pxy$ achieves superior effectiveness for unknown-class detection and known-class classification, as validated by experiments on benchmark graph datasets. Moreover, $\mathcal{G}^2Pxy$ imposes no specific requirements on the GNN architecture and generalizes well.
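The proxy idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes standard mixup (a convex combination of two feature vectors with a Beta-distributed weight), covers only inter-class proxies (external proxies are omitted because the abstract does not specify how they are built), and uses one common definition of complement entropy, namely the entropy of the predicted distribution renormalized over the non-target classes. All function names, the toy features, and the example logits are hypothetical.

```python
import math
import random


def mixup(x_a, x_b, lam):
    """Standard mixup: convex combination of two feature vectors."""
    return [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]


def inter_class_proxies(features, labels, n_proxies, rng):
    """Mix pairs of nodes drawn from different known classes.

    Assumption: a point between two known classes serves as a stand-in
    for a node of an unknown class.
    """
    if len(set(labels)) < 2:
        raise ValueError("need at least two known classes to mix")
    indices = range(len(features))
    proxies = []
    while len(proxies) < n_proxies:
        i, j = rng.sample(indices, 2)
        if labels[i] == labels[j]:
            continue  # keep only pairs from different classes
        lam = rng.betavariate(1.0, 1.0)  # assumed Beta(1, 1) mixing weight
        proxies.append(mixup(features[i], features[j], lam))
    return proxies


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Negative log-probability of the target class."""
    return -math.log(softmax(logits)[target])


def complement_entropy(logits, target):
    """Entropy of the prediction renormalized over non-target classes."""
    probs = softmax(logits)
    rest = 1.0 - probs[target]
    if rest <= 0.0:
        return 0.0
    entropy = 0.0
    for k, p in enumerate(probs):
        if k != target and p > 0.0:
            q = p / rest
            entropy -= q * math.log(q)
    return entropy


# Toy data: four nodes with 2-d features from two known classes.
rng = random.Random(0)
features = [[0.0, 0.0], [0.1, 0.2], [2.0, 2.0], [2.1, 1.9]]
labels = [0, 0, 1, 1]
num_known = 2

proxies = inter_class_proxies(features, labels, 3, rng)

# The open-set classifier has K + 1 outputs; proxies take the extra index K.
proxy_label = num_known
logits = [0.2, 0.1, 1.5]  # hypothetical K + 1 logits for one proxy node
loss = cross_entropy(logits, proxy_label)
```

In this sketch the closed-set classifier's K outputs are extended by one proxy output, and each generated proxy is labeled with index K, so the cross-entropy term pushes proxies toward the extra output. How the complement entropy term is weighted and combined with cross entropy is not stated in the abstract and is left out here.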