Dynamic graph learning plays a pivotal role in modeling relationships that evolve over time, especially for temporal link prediction tasks in domains such as traffic systems, social networks, and recommendation platforms. While Transformer-based models have demonstrated strong performance by capturing long-range temporal dependencies, their reliance on self-attention incurs quadratic complexity in the sequence length, limiting scalability on high-frequency or large-scale graphs. In this work, we revisit whether self-attention is necessary for dynamic graph modeling. Inspired by recent findings that attribute the success of Transformers more to their overall architectural design than to attention itself, we propose GLFormer, a novel attention-free Transformer-style framework for dynamic graphs. GLFormer introduces an adaptive token mixer that performs context-aware local aggregation conditioned on interaction order and time intervals. To capture long-term dependencies, we further design a hierarchical aggregation module that expands the temporal receptive field by stacking local token mixers across layers. Experiments on six widely used dynamic graph benchmarks show that GLFormer achieves state-of-the-art performance, demonstrating that attention-free architectures can match or surpass Transformer baselines in dynamic graph settings at significantly lower computational cost.
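To make the two mechanisms concrete, below is a minimal PyTorch sketch of how an attention-free local token mixer conditioned on interaction order and time intervals, and its hierarchical stacking, might look. The module names (`LocalTokenMixer`, `HierarchicalMixer`), the shapes, and the small weighting network are our illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn


class LocalTokenMixer(nn.Module):
    """Hypothetical context-aware local token mixer (illustrative sketch).

    Each token is mixed with its `window` most recent predecessors using
    weights conditioned on the order offset and the time interval, in
    place of quadratic self-attention.
    """

    def __init__(self, dim: int, window: int = 5):
        super().__init__()
        self.window = window
        # Map each (order offset, time interval) pair to a scalar mixing weight.
        self.weight_net = nn.Sequential(
            nn.Linear(2, dim), nn.ReLU(), nn.Linear(dim, 1)
        )
        self.proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        # x: (B, L, D) features of a node's chronologically ordered interactions
        # t: (B, L)    float timestamps, ascending along the sequence dimension
        B, L, D = x.shape
        pos = torch.arange(L, device=x.device)
        mixed = torch.zeros_like(x)
        for k in range(self.window):
            # Predecessor k steps back; positions i < k have none (mask wrap-around).
            x_k = torch.roll(x, shifts=k, dims=1)
            dt = t - torch.roll(t, shifts=k, dims=1)        # time interval
            order = torch.full_like(dt, float(k))           # order offset
            ctx = torch.stack([order, dt], dim=-1)          # (B, L, 2)
            w = self.weight_net(ctx)                        # (B, L, 1)
            valid = (pos >= k).float().view(1, L, 1)
            mixed = mixed + valid * w * x_k
        return self.proj(mixed)


class HierarchicalMixer(nn.Module):
    """Stacked local mixers: the temporal receptive field grows by roughly
    (window - 1) past interactions per layer, without any attention."""

    def __init__(self, dim: int, window: int = 5, depth: int = 3):
        super().__init__()
        self.mixers = nn.ModuleList(
            LocalTokenMixer(dim, window) for _ in range(depth)
        )
        self.norms = nn.ModuleList(nn.LayerNorm(dim) for _ in range(depth))

    def forward(self, x: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        for mixer, norm in zip(self.mixers, self.norms):
            x = x + mixer(norm(x), t)  # pre-norm residual, Transformer-style
        return x


# Example usage: 2 nodes, 128 interactions each, 64-dim token features.
mixer = HierarchicalMixer(dim=64)
x = torch.randn(2, 128, 64)
t = torch.sort(torch.rand(2, 128), dim=1).values
out = mixer(x, t)  # (2, 128, 64)
```

Under these assumptions, each layer touches only `window` predecessors per token, so the cost is O(L x window) rather than O(L^2), and stacking `depth` layers expands the effective receptive field to roughly depth x (window - 1) past interactions, mirroring the hierarchical aggregation described above.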