Diffusion-Jump GNNs: Homophiliation via Learnable Metric Filters

High-order Graph Neural Networks (HO-GNNs) have been developed to infer consistent latent spaces in the heterophilic regime, where the label distribution is not correlated with the graph structure. However, most existing HO-GNNs are hop-based, i.e., they rely on the powers of the transition matrix. As a result, these architectures are not fully reactive to the classification loss, and the resulting structural filters have static supports. In other words, neither the filters' supports nor their coefficients can be learned with these networks. Instead, they are confined to learning combinations of filters. To address these concerns, we propose Diffusion-jump GNNs, a method relying on asymptotic diffusion distances that operates on jumps. A diffusion-pump generates pairwise distances whose projections determine both the support and the coefficients of each structural filter. These filters are called jumps because they explore a wide range of scales in order to find bonds between scattered nodes with the same label. The full process is controlled by the classification loss: both the jumps and the diffusion distances react to classification errors (i.e., they are learnable). Homophiliation, i.e., the process of learning piecewise smooth latent spaces in the heterophilic regime, is formulated as a Dirichlet problem: the known labels determine the border nodes, and the diffusion-pump ensures a minimal deviation of the semi-supervised grouping from a canonical unsupervised grouping. This triggers the update of the diffusion distances and, consequently, of the jumps in order to minimize the classification error. The Dirichlet formulation has several advantages. It leads to the definition of structural heterophily, a novel measure beyond edge heterophily. It also allows us to investigate links with (learnable) diffusion distances, absorbing random walks and stochastic diffusion.
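To make the Dirichlet formulation concrete, the following is a minimal sketch of the classical discrete Dirichlet problem on a fixed graph, not the learnable diffusion-pump described above. Labeled nodes act as border nodes with fixed values, and the remaining nodes receive the harmonic extension, which equals the absorption probabilities of a random walk that stops at the labeled nodes. The function name `dirichlet_labels`, its arguments, and the toy path graph are assumptions introduced for illustration only.

```python
import numpy as np


def dirichlet_labels(A, boundary, boundary_onehot):
    """Harmonic extension of known labels on a graph (hypothetical helper).

    A               : (n, n) symmetric adjacency matrix of a connected graph
    boundary        : indices of the labeled (border) nodes
    boundary_onehot : (len(boundary), c) one-hot labels of the border nodes
    Returns an (n, c) matrix of soft labels; each row sums to 1.
    """
    n = A.shape[0]
    L = np.diag(A.sum(axis=1)) - A              # combinatorial Laplacian D - A
    boundary = np.asarray(boundary)
    interior = np.setdiff1d(np.arange(n), boundary)

    L_ii = L[np.ix_(interior, interior)]        # interior-interior block
    L_ib = L[np.ix_(interior, boundary)]        # interior-border block

    F = np.zeros((n, boundary_onehot.shape[1]))
    F[boundary] = boundary_onehot               # Dirichlet (border) condition
    # Harmonic condition on the interior: L_ii F_i + L_ib F_b = 0
    F[interior] = np.linalg.solve(L_ii, -L_ib @ boundary_onehot)
    return F


# Toy example (assumed): a path graph 0-1-2-3-4 with both end nodes labeled.
A = np.zeros((5, 5))
for u in range(4):
    A[u, u + 1] = A[u + 1, u] = 1.0

F = dirichlet_labels(A, boundary=[0, 4], boundary_onehot=np.eye(2))
print(F)  # soft labels interpolate linearly between the two border nodes
```

In this sketch the graph, and hence the solution, is fixed. The abstract describes a setting in which the diffusion distances themselves are updated by the classification loss, which this example does not attempt to reproduce.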
