Many existing transductive bounds rely on classical complexity measures that are computationally intractable and often misaligned with empirical behavior. In this work, we establish new representation-based generalization bounds in a distribution-free transductive setting, where learned representations are dependent, and test features are accessible during training. We derive global and class-wise bounds via optimal transport, expressed in terms of Wasserstein distances between encoded feature distributions. We demonstrate that our bounds are efficiently computable and strongly correlate with empirical generalization in graph node classification, improving upon classical complexity measures. Additionally, our analysis reveals how the GNN aggregation process transforms the representation distributions, inducing a trade-off between intra-class concentration and inter-class separation. This yields depth-dependent characterizations that capture the non-monotonic relationship between depth and generalization error observed in practice. The code is available at https://github.com/ml-postech/Transductive-OT-Gen-Bound.
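To make the central quantity concrete, the following is a minimal sketch of estimating a Wasserstein-type distance between two sets of encoded features, for example training-node and test-node representations. It is not the authors' estimator: it swaps in a sliced Wasserstein approximation (the average of exact 1-D distances over random projections) so that it runs with the Python standard library alone. The function names, the toy data, and the equal-sample-size assumption are all hypothetical; the linked repository is the reference for the actual bound computation.

```python
import math
import random


def wasserstein_1d(a, b):
    """Exact 1-Wasserstein distance between two equal-size 1-D samples.

    For uniform empirical measures of equal size, the optimal coupling
    matches sorted values, so the distance is the mean absolute gap.
    """
    if len(a) != len(b):
        raise ValueError("this sketch assumes equal sample sizes")
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)


def sliced_wasserstein(X, Y, num_projections=200, seed=0):
    """Sliced 1-Wasserstein approximation between two point clouds.

    X and Y are lists of equal-length feature vectors. This is an
    illustrative stand-in, not the estimator used in the paper.
    """
    rng = random.Random(seed)
    dim = len(X[0])
    total = 0.0
    for _ in range(num_projections):
        # Draw a random direction on the unit sphere.
        direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(d * d for d in direction))
        direction = [d / norm for d in direction]
        # Project both clouds onto that direction and compare in 1-D.
        proj_x = [sum(xi * di for xi, di in zip(x, direction)) for x in X]
        proj_y = [sum(yi * di for yi, di in zip(y, direction)) for y in Y]
        total += wasserstein_1d(proj_x, proj_y)
    return total / num_projections


# Hypothetical toy representations: the "test" cloud is the "train"
# cloud shifted by one unit along the first coordinate.
train_repr = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
test_repr = [[x + 1.0, y] for x, y in train_repr]

global_gap = sliced_wasserstein(train_repr, test_repr)
```

A class-wise variant would apply the same function separately to the representations of each class. Because each projection is 1-Lipschitz, the sliced value never exceeds the true 1-Wasserstein distance, so it should be read as a cheap proxy rather than the quantity appearing in the bounds.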