In the realm of graph learning, there is a category of methods that conceptualize graphs as hierarchical structures, utilizing node clustering to capture broader structural information. While generally effective, these methods often rely on a fixed graph coarsening routine, leading to overly homogeneous cluster representations and loss of node-level information. In this paper, we envision the graph as a network of interconnected node sets without compressing each cluster into a single embedding. To enable effective information transfer among these node sets, we propose the Node-to-Cluster Attention (N2C-Attn) mechanism. N2C-Attn incorporates techniques from Multiple Kernel Learning into the kernelized attention framework, effectively capturing information at both node and cluster levels. We then devise an efficient form for N2C-Attn using the cluster-wise message-passing framework, achieving linear time complexity. We further analyze how N2C-Attn combines bi-level feature maps of queries and keys, demonstrating its capability to merge dual-granularity information. The resulting architecture, Cluster-wise Graph Transformer (Cluster-GT), which uses node clusters as tokens and employs our proposed N2C-Attn module, shows superior performance on various graph-level tasks. Code is available at https://github.com/LUMIA-Group/Cluster-wise-Graph-Transformer.
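To make the kernelized-attention idea concrete, below is a minimal NumPy sketch of how node-level and cluster-level kernels could be combined and evaluated in linear time. This is an illustrative assumption, not the paper's implementation: the function name `n2c_attention_sketch`, the ELU+1 feature map, and the mixing weight `alpha` are all hypothetical choices. It uses the standard multiple-kernel-learning fact that a convex combination of kernels, k = α·k_node + (1−α)·k_cluster, corresponds to concatenating √-scaled feature maps, after which attention can be computed without materializing the full attention matrix.

```python
import numpy as np

def feature_map(x):
    # ELU(x) + 1: a common positive feature map used in kernelized
    # (linear) attention; any positive feature map would do here.
    return np.where(x > 0, x + 1.0, np.exp(x))

def n2c_attention_sketch(Qn, Kn, Qc, Kc, V, alpha=0.5):
    """Hypothetical sketch of bi-level kernelized attention.

    Qn, Kn: node-level queries/keys; Qc, Kc: cluster-level queries/keys
    (each row of Qc/Kc repeats the embedding of the cluster that the
    corresponding node belongs to). V: values. alpha in [0, 1] weights
    the node-level kernel against the cluster-level kernel.
    """
    # Convex kernel combination via concatenated, sqrt-scaled feature maps:
    # <phi_q, phi_k> = alpha * k_node + (1 - alpha) * k_cluster
    phi_q = np.concatenate([np.sqrt(alpha) * feature_map(Qn),
                            np.sqrt(1 - alpha) * feature_map(Qc)], axis=-1)
    phi_k = np.concatenate([np.sqrt(alpha) * feature_map(Kn),
                            np.sqrt(1 - alpha) * feature_map(Kc)], axis=-1)
    # Linear-time trick: aggregate keys/values once, then project queries.
    KV = phi_k.T @ V           # (2d, d_v), independent of query count
    Z = phi_k.sum(axis=0)      # (2d,), normalizer
    return (phi_q @ KV) / (phi_q @ Z)[:, None]
```

The key point is that `KV` and `Z` are computed once over all keys, so the cost is linear in the number of tokens rather than quadratic, while the concatenated feature maps realize the weighted sum of the two granularity-specific kernels exactly.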