Spatial-Temporal Graph (STG) forecasting on large-scale networks has garnered significant attention. However, existing models predominantly focus on short-horizon prediction and incur prohibitive computational and memory costs when scaled to long horizons and large graphs. To address these challenges, we present FaST, an effective and efficient framework based on heterogeneity-aware Mixture-of-Experts (MoE) for long-horizon and large-scale STG forecasting, which unlocks one-week-ahead prediction (672 steps at a 15-minute granularity) on graphs with thousands of nodes. FaST is underpinned by two key innovations. First, an adaptive graph agent attention mechanism alleviates the computational burden that conventional graph convolution and self-attention modules incur on large-scale graphs. Second, we propose a new parallel MoE module that replaces traditional feed-forward networks with Gated Linear Units (GLUs), yielding an efficient and scalable parallel structure. Extensive experiments on real-world datasets demonstrate that FaST not only delivers superior long-horizon predictive accuracy but also achieves remarkable computational efficiency compared with state-of-the-art baselines. Our source code is available at: https://github.com/yijizhao/FaST.
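To make the GLU-based MoE idea concrete, the sketch below shows a minimal dense mixture of GLU experts in NumPy: each expert is a Gated Linear Unit (a linear projection modulated elementwise by a sigmoid-gated linear projection), and a softmax gate mixes expert outputs. All shapes, names, and the dense (non-sparse) gating are illustrative assumptions, not the FaST implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def glu(x, W, V):
    """Gated Linear Unit: (x @ W) * sigmoid(x @ V)."""
    return (x @ W) * (1.0 / (1.0 + np.exp(-(x @ V))))

def moe_glu(x, Ws, Vs, G):
    """Dense mixture of GLU experts (illustrative, not the paper's exact module).

    x  : (batch, d_in) input features
    Ws, Vs : per-expert GLU weights, each (d_in, d_out)
    G  : (d_in, E) gating weights producing one logit per expert
    """
    scores = x @ G                                   # (batch, E) gating logits
    probs = np.exp(scores - scores.max(-1, keepdims=True))
    probs /= probs.sum(-1, keepdims=True)            # softmax over experts
    # Run all experts in parallel and stack: (batch, d_out, E)
    outs = np.stack([glu(x, W, V) for W, V in zip(Ws, Vs)], axis=-1)
    # Mix expert outputs with the gate probabilities
    return (outs * probs[:, None, :]).sum(-1)        # (batch, d_out)

d_in, d_out, E = 8, 8, 4
Ws = [rng.normal(size=(d_in, d_out)) for _ in range(E)]
Vs = [rng.normal(size=(d_in, d_out)) for _ in range(E)]
G = rng.normal(size=(d_in, E))
y = moe_glu(rng.normal(size=(2, d_in)), Ws, Vs, G)
print(y.shape)  # (2, 8)
```

Because every expert is a simple matrix product with elementwise gating, the experts can be evaluated in parallel as one batched operation, which is the efficiency property the abstract highlights.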