Graph Neural Networks (GNNs) are becoming the method of choice for graph learning. They exploit the semi-supervised nature of deep learning, and they bypass the computational bottlenecks associated with traditional graph learning methods. In addition to the feature matrix $X$, GNNs need an adjacency matrix $A$ to perform feature propagation. In many cases, the adjacency matrix $A$ is missing. We introduce a graph construction scheme that builds the adjacency matrix $A$ from both unsupervised and supervised information. Unsupervised information characterizes the neighborhood around each point. We use Principal Axis trees (PA-trees) as the source of unsupervised information, creating edges between points that fall into the same leaf node. For supervised information, we use the concept of penalty and intrinsic graphs. A penalty graph connects points with different class labels, whereas an intrinsic graph connects points with the same class label. We use the penalty and intrinsic graphs to remove edges from, or add edges to, the graph constructed via the PA-tree. We test this graph construction scheme on two well-known GNNs: 1) Graph Convolutional Network (GCN) and 2) Simple Graph Convolution (SGC). The experiments show that SGC is preferable because it is faster and delivers results that are the same as or better than those of GCN. We also test the effect of oversmoothing on both GCN and SGC. We find that the level of smoothing has to be carefully selected for SGC to avoid oversmoothing.
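The construction described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several details are assumptions made for illustration: the PA-tree is simplified to a recursive median split along the first principal axis of each node, labels are assumed to be known only for a subset of points (marked `-1` elsewhere), the penalty graph is read as removing edges between labeled points of different classes, and the intrinsic graph as adding edges between labeled points of the same class. The function names `pa_tree_leaves`, `build_adjacency`, and `sgc_features` are hypothetical. The last function applies the standard SGC propagation $S^k X$ with $S = \tilde{D}^{-1/2}(A + I)\tilde{D}^{-1/2}$, where $k$ sets the level of smoothing.

```python
import numpy as np

def pa_tree_leaves(X, leaf_size=4):
    """Assign each row of X to a leaf via recursive principal-axis splits.

    Simplified stand-in for a PA-tree (an assumption): each node is split
    at the median of the projections onto its first principal axis.
    """
    leaf_ids = np.zeros(len(X), dtype=int)
    next_id = 0
    stack = [np.arange(len(X))]
    while stack:
        idx = stack.pop()
        if len(idx) > leaf_size:
            centered = X[idx] - X[idx].mean(axis=0)
            # The first right singular vector is the node's principal axis.
            _, _, vt = np.linalg.svd(centered, full_matrices=False)
            order = np.argsort(centered @ vt[0])
            half = len(idx) // 2
            stack.append(idx[order[:half]])
            stack.append(idx[order[half:]])
        else:
            leaf_ids[idx] = next_id
            next_id += 1
    return leaf_ids

def build_adjacency(leaf_ids, labels):
    """Build A from leaf membership, then edit it with label information.

    labels holds a class index per point, or -1 where the label is unknown.
    """
    # Unsupervised part: connect points that share a leaf.
    A = (leaf_ids[:, None] == leaf_ids[None, :]).astype(float)
    known = labels >= 0
    both_known = known[:, None] & known[None, :]
    same = labels[:, None] == labels[None, :]
    A[both_known & ~same] = 0.0  # penalty graph: remove cross-class edges
    A[both_known & same] = 1.0   # intrinsic graph: add same-class edges
    np.fill_diagonal(A, 0.0)     # self-loops are added during propagation
    return A

def sgc_features(A, X, k=2):
    """Return S^k X, with S the symmetrically normalized A + I."""
    A_hat = A + np.eye(len(A))
    d = A_hat.sum(axis=1)
    S = A_hat / np.sqrt(np.outer(d, d))
    out = X
    for _ in range(k):
        out = S @ out
    return out

# Toy data: two Gaussian blobs, with two labeled points per class.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (10, 3)), rng.normal(5, 1, (10, 3))])
labels = np.full(20, -1)
labels[[0, 1, 10, 11]] = [0, 0, 1, 1]

leaf_ids = pa_tree_leaves(X, leaf_size=4)
A = build_adjacency(leaf_ids, labels)
H = sgc_features(A, X, k=2)
```

In this sketch, the smoothed features `H` would be passed to a linear classifier, as in SGC. Increasing `k` averages features over larger neighborhoods, which is where the oversmoothing risk mentioned above comes from.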