Graph Neural Networks (GNNs) have gained considerable attention for their potential in addressing challenges posed by complex graph-structured data in diverse domains. However, accurately annotating graph data for training is difficult due to the inherent complexity and interconnectedness of graphs. To tackle this issue, we propose a novel graph representation learning method that enables GNN models to effectively learn discriminative information even in the presence of noisy labels within the context of Partially Labeled Learning (PLL). PLL is a critical weakly supervised learning problem, where each training instance is associated with a set of candidate labels, including both the true label and additional noisy labels. Our approach leverages potential cause extraction to obtain graph data that exhibit a higher likelihood of possessing a causal relationship with the labels. By incorporating auxiliary training based on the extracted graph data, our model can effectively filter out the noise contained in the labels. We support the rationale behind our approach with a series of theoretical analyses. Moreover, we conduct extensive evaluations and ablation studies on multiple datasets, demonstrating the superiority of our proposed method.
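To make the PLL setting concrete, the following minimal sketch builds candidate label sets (the true label plus randomly drawn noisy labels) and evaluates a generic candidate-set loss, namely the negative log of the probability mass a classifier places on the candidate set. This is a common PLL baseline used here purely for illustration; it is not the potential-cause-extraction method proposed in this work. The function names, the uniform noise model, and the random logits standing in for GNN outputs are all assumptions.

```python
import numpy as np


def make_candidate_sets(true_labels, num_classes, num_noisy, rng):
    """Build a boolean mask of shape (n, num_classes).

    Each row marks the true label plus `num_noisy` distinct distractor
    labels drawn uniformly at random (an assumed noise model).
    """
    mask = np.zeros((len(true_labels), num_classes), dtype=bool)
    for i, y in enumerate(true_labels):
        others = [c for c in range(num_classes) if c != y]
        noisy = rng.choice(others, size=num_noisy, replace=False)
        mask[i, y] = True
        mask[i, noisy] = True
    return mask


def candidate_set_loss(logits, mask):
    """Mean negative log of the softmax mass assigned to each candidate set."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(shifted)
    probs /= probs.sum(axis=1, keepdims=True)
    in_set = (probs * mask).sum(axis=1)
    return float(-np.log(in_set).mean())


rng = np.random.default_rng(0)
true_labels = np.array([0, 2, 1])
mask = make_candidate_sets(true_labels, num_classes=4, num_noisy=1, rng=rng)

# Random logits stand in for the per-graph outputs of a GNN classifier.
logits = rng.normal(size=(3, 4))
loss = candidate_set_loss(logits, mask)
print(mask.astype(int))
print(loss)
```

The loss only rewards probability mass that falls inside the candidate set, so it cannot by itself distinguish the true label from the noisy ones; that disambiguation is the gap that the auxiliary training described above is intended to address.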