Since their introduction by Kipf and Welling in 2017, a primary use of graph convolutional networks has been transductive node classification, where missing labels are inferred within a single observed graph and its feature matrix. Despite the widespread use of this model, the statistical foundations of transductive learning remain limited, as standard inference frameworks typically rely on multiple independent samples rather than a single graph. In this work, we address this gap by developing new concentration-of-measure tools that leverage the geometric regularities of large graphs via low-dimensional metric embeddings. These emergent regularities are captured using a random graph model; however, the methods remain applicable to deterministic graphs once observed. We establish two principal learning results. The first concerns arbitrary deterministic $k$-vertex graphs, and the second addresses random graphs that share key geometric properties with an Erd\H{o}s-R\'{e}nyi graph $\mathbf{G}=\mathbf{G}(k,p)$ in the regime $p \in \mathcal{O}((\log (k)/k)^{1/2})$. The first result serves as the basis for the second and illuminates it. We then extend these results to the graph convolutional network setting, where additional challenges arise. Lastly, our learning guarantees remain informative even when the number of labelled nodes $N$ is small, and they achieve the optimal nonparametric rate $\mathcal{O}(N^{-1/2})$ as $N$ grows.