Latent graph inference (LGI) aims to jointly learn the underlying graph structure and node representations from data features. However, existing LGI methods commonly suffer from supervision starvation: a large number of edge weights are learned without semantic supervision and do not contribute to the training loss. These supervision-starved weights, which may determine the predictions for test samples, cannot be semantically optimal, and generalization suffers as a result. In this paper, we observe that the issue is caused by the graph sparsification operation, which severely damages the important connections established between pivotal nodes and labeled ones. To address this, we propose to restore the corrupted affinities and replenish the missing supervision for better LGI. The key challenge lies in identifying the pivotal nodes and recovering the corrupted affinities. We begin by defining the pivotal nodes as $k$-hop starved nodes, which can be identified from a given adjacency matrix. Because this identification is computationally expensive, we further present a more efficient alternative inspired by CUR matrix decomposition. We then eliminate the starved nodes by reconstructing the destroyed connections. Extensive experiments on representative benchmarks demonstrate that reducing the number of starved nodes consistently improves the performance of state-of-the-art LGI methods, especially under extremely limited supervision (e.g., a 6.12% improvement on Pubmed at a labeling rate of only 0.3%).
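As an illustration of how $k$-hop starved nodes might be identified from an adjacency matrix, the following is a minimal sketch rather than the paper's procedure. It assumes a working definition that the abstract does not spell out: a node is treated as $k$-hop starved if it is unlabeled and no labeled node can be reached from it within $k$ hops. The function name `k_hop_starved_nodes`, its parameters, and the convention that `adj[i, j] != 0` means node `i` aggregates from node `j` are all hypothetical choices made for this example.

```python
import numpy as np


def k_hop_starved_nodes(adj, labeled_mask, k):
    """Return indices of nodes with no labeled node within k hops.

    Assumed definition (not taken from the abstract): a node is
    k-hop starved if it is unlabeled and none of its 1..k-hop
    neighbours is labeled. adj[i, j] != 0 means i aggregates from j.
    """
    # Binarise the (sparsified) adjacency matrix.
    neighbours = (np.asarray(adj) != 0).astype(np.int64)
    # reached[i] is True once a labeled node lies within the hops explored so far.
    reached = np.asarray(labeled_mask, dtype=bool).copy()
    for _ in range(k):
        # A node reaches supervision in one more hop if any neighbour already does.
        via_neighbour = (neighbours @ reached.astype(np.int64)) > 0
        reached = reached | via_neighbour
    # Labeled nodes are marked as reached from the start, so they are never returned.
    return np.flatnonzero(~reached)


# Toy example: a path graph 0-1-2-3-4 in which only node 0 is labeled.
path = np.zeros((5, 5), dtype=np.int64)
for i in range(4):
    path[i, i + 1] = 1
    path[i + 1, i] = 1
labels = np.array([True, False, False, False, False])

print(k_hop_starved_nodes(path, labels, k=2).tolist())  # [3, 4]
```

Each iteration is one dense matrix-vector product, so the sketch costs $O(kn^2)$ for $n$ nodes; this is only meant to make the notion concrete and does not reproduce the CUR-inspired alternative mentioned in the abstract.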