Graph Neural Networks (GNNs) have shown promising potential in graph representation learning. The majority of GNNs define a local message-passing mechanism, propagating information over the graph by stacking multiple layers. These methods, however, are known to suffer from two major limitations: over-squashing and poor capture of long-range dependencies. Recently, Graph Transformers (GTs) emerged as a powerful alternative to Message-Passing Neural Networks (MPNNs). GTs, however, have quadratic computational cost, lack inductive biases on graph structures, and rely on complex Positional/Structural Encodings (PE/SE). In this paper, we show that while Transformers, complex message-passing, and PE/SE are sufficient for good performance in practice, none of them is necessary. Motivated by the recent success of State Space Models (SSMs), such as Mamba, we present Graph Mamba Networks (GMNs), a general framework for a new class of GNNs based on selective SSMs. We discuss and categorize the new challenges that arise when adapting SSMs to graph-structured data, and present four required steps and one optional step to design GMNs, where we choose (1) Neighborhood Tokenization, (2) Token Ordering, (3) Architecture of Bidirectional Selective SSM Encoder, (4) Local Encoding, and optionally (5) PE and SE. We further provide theoretical justification for the power of GMNs. Experiments demonstrate that despite much lower computational cost, GMNs attain outstanding performance on long-range, small-scale, large-scale, and heterophilic benchmark datasets.
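The five-step recipe above can be illustrated with a toy sketch. All names below are hypothetical and the scan is a deliberately simplified scalar recurrence, not the authors' implementation: neighborhoods are tokenized into BFS hop layers, tokens are ordered by hop distance, and a bidirectional input-dependent ("selective") scan encodes the resulting sequence.

```python
def khop_neighborhoods(adj, node, k):
    """Step 1, Neighborhood Tokenization: BFS layers at hops 0..k from `node`."""
    seen = {node}
    layers, frontier = [[node]], [node]
    for _ in range(k):
        nxt = []
        for u in frontier:
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    nxt.append(v)
        layers.append(nxt)
        frontier = nxt
    return layers

def selective_scan(xs):
    """Toy selective SSM: h_t = a(x_t) * h_{t-1} + b(x_t) * x_t.

    The gates a, b depend on the input token, which is the "selective" part;
    a real Mamba block uses learned, vector-valued parameterizations.
    """
    h, out = 0.0, []
    for x in xs:
        a = 1.0 / (1.0 + abs(x))  # input-dependent decay
        b = 1.0 - a
        h = a * h + b * x
        out.append(h)
    return out

def bidirectional_encode(tokens):
    """Step 3, Bidirectional Encoder: scan both directions, sum the states."""
    fwd = selective_scan(tokens)
    bwd = selective_scan(tokens[::-1])[::-1]
    return [f + b for f, b in zip(fwd, bwd)]

# Step 2, Token Ordering: flatten layers farthest-to-nearest so the recurrence
# ends at the local neighborhood (one plausible ordering choice, not the only one).
adj = {0: [1, 2], 1: [0, 3], 2: [0], 3: [1]}
layers = khop_neighborhoods(adj, 0, 2)
feats = [float(v) for layer in reversed(layers) for v in layer]
rep = bidirectional_encode(feats)  # one encoded state per token
```

Steps 4 (Local Encoding, e.g. a small MPNN over each neighborhood) and the optional step 5 (PE/SE) would produce the token features fed into the scan; here raw node ids stand in for features purely for illustration.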