Graph Neural Networks (GNNs) have shown promising potential in graph representation learning. The majority of GNNs define a local message-passing mechanism, propagating information over the graph by stacking multiple layers. These methods, however, are known to suffer from two major limitations: over-squashing and poor capture of long-range dependencies. Recently, Graph Transformers (GTs) have emerged as a powerful alternative to Message-Passing Neural Networks (MPNNs). GTs, however, have quadratic computational cost, lack inductive biases on graph structures, and rely on complex Positional and Structural Encodings (PE/SE). In this paper, we show that while Transformers, complex message-passing, and PE/SE are sufficient for good performance in practice, none of them is necessary. Motivated by the recent success of State Space Models (SSMs), such as Mamba, we present Graph Mamba Networks (GMNs), a general framework for a new class of GNNs based on selective SSMs. We discuss and categorize the new challenges that arise when adapting SSMs to graph-structured data, and present four required steps and one optional step for designing GMNs: (1) Neighborhood Tokenization, (2) Token Ordering, (3) Architecture of Bidirectional Selective SSM Encoder, (4) Local Encoding, and the optional (5) PE and SE. We further provide theoretical justification for the power of GMNs. Experiments demonstrate that, despite a much lower computational cost, GMNs attain outstanding performance on long-range, small-scale, large-scale, and heterophilic benchmark datasets.
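To make steps (1) to (3) more concrete, the following is a minimal sketch under strong simplifying assumptions, not the implementation described in the paper. All function names (`k_hop_tokens`, `selective_scan`, `bidirectional_scan`, `encode_node`) are hypothetical. Tokenization is approximated by one token per hop distance (the mean scalar feature of the nodes at that distance), the ordering puts the farthest hop first so the node itself is scanned last, and the selective SSM is replaced by a scalar gated recurrence whose gate depends on the input. Local Encoding and PE/SE are omitted.

```python
import math
from collections import deque


def k_hop_tokens(adj, feats, node, max_hop):
    """Assumed tokenization: one token per hop distance 0..max_hop,
    each token being the mean scalar feature of the nodes at that distance."""
    dist = {node: 0}
    queue = deque([node])
    while queue:  # breadth-first search, truncated at max_hop
        u = queue.popleft()
        if dist[u] == max_hop:
            continue
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    tokens = []
    for hop in range(max_hop + 1):
        members = [feats[v] for v, d in dist.items() if d == hop]
        tokens.append(sum(members) / len(members) if members else 0.0)
    return tokens


def selective_scan(xs, w=1.0, b=0.0):
    """Simplified stand-in for a selective SSM:
    h_t = a_t * h_{t-1} + (1 - a_t) * x_t, with a_t = sigmoid(w * x_t + b).
    The gate a_t depends on the input, which is the 'selective' part."""
    h, out = 0.0, []
    for x in xs:
        a = 1.0 / (1.0 + math.exp(-(w * x + b)))
        h = a * h + (1.0 - a) * x
        out.append(h)
    return out


def bidirectional_scan(xs, w=1.0, b=0.0):
    """Scan the sequence in both directions and sum the aligned outputs."""
    fwd = selective_scan(xs, w, b)
    bwd = selective_scan(xs[::-1], w, b)[::-1]
    return [f + r for f, r in zip(fwd, bwd)]


def encode_node(adj, feats, node, max_hop=2):
    """Tokenize, order (farthest hop first), scan, and read out the last position."""
    tokens = k_hop_tokens(adj, feats, node, max_hop)
    ordered = tokens[::-1]  # assumed ordering: the node's own token comes last
    return bidirectional_scan(ordered)[-1]


if __name__ == "__main__":
    # Path graph 0 - 1 - 2 - 3 with scalar node features.
    adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
    feats = [1.0, 2.0, 3.0, 4.0]
    print(k_hop_tokens(adj, feats, 1, 2))
    print(encode_node(adj, feats, 1))
```

The recurrence costs time linear in the number of tokens per node, which illustrates the contrast the abstract draws with the quadratic cost of attention. A real selective SSM would use vector-valued states and learned, input-dependent parameters rather than the fixed scalars `w` and `b` assumed here.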