Partial differential equations (PDEs) are fundamental for modeling complex physical systems, yet classical numerical solvers face prohibitive computational costs in high-dimensional and multi-scale regimes. While Transformer-based neural operators have emerged as powerful data-driven alternatives, they conventionally treat all discretized spatial points as uniform, independent tokens. This monolithic approach ignores the intrinsic scale separation of physical fields, applying costly global attention that redundantly mixes smooth large-scale dynamics with high-frequency fluctuations. Rethinking Transformers through the lens of complex dynamics, we propose DynFormer, a dynamics-informed neural operator. Rather than applying a uniform attention mechanism across all scales, DynFormer explicitly assigns specialized network modules to distinct physical scales. A Spectral Embedding isolates the low-frequency modes, on which a Kronecker-structured attention mechanism captures large-scale global interactions with reduced complexity. In parallel, we introduce a Local-Global-Mixing transformation that uses nonlinear multiplicative frequency mixing to implicitly reconstruct the small-scale, fast-varying turbulent cascades that are slaved to the macroscopic state, without incurring the cost of global attention. Integrating these modules into a hybrid evolution architecture ensures robust long-term temporal stability. Extensive evaluations across four PDE benchmarks, conducted under matched GPU memory budgets, demonstrate that DynFormer reduces relative error by up to 95% compared to state-of-the-art baselines while consuming significantly less GPU memory. Our results establish that embedding first-principles physical dynamics into Transformer architectures yields a highly scalable, theoretically grounded blueprint for PDE surrogate modeling.
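To make the two mechanisms concrete, the sketch below shows one plausible PyTorch realization, not the paper's actual implementation. All class names, shapes, and hyperparameters (`SpectralEmbedding`, `KroneckerAttention`, `LocalGlobalMixing`, `modes=12`, `heads=4`) are illustrative assumptions: the spectral embedding truncates the 2D Fourier spectrum to isolate low-frequency modes; an axial row/column factorization stands in for the Kronecker-structured attention; and a learned pointwise gate stands in for multiplicative frequency mixing.

```python
import torch
import torch.nn as nn


class SpectralEmbedding(nn.Module):
    """Isolate the lowest Fourier modes of a field u of shape (B, C, H, W)."""

    def __init__(self, modes: int):
        super().__init__()
        self.modes = modes

    def forward(self, u: torch.Tensor) -> torch.Tensor:
        # Truncate the 2D spectrum to low frequencies, then map back to
        # physical space: a smooth, large-scale view of the input field.
        spec = torch.fft.rfft2(u)
        low = torch.zeros_like(spec)
        m = self.modes
        low[..., :m, :m] = spec[..., :m, :m]    # positive vertical freqs
        low[..., -m:, :m] = spec[..., -m:, :m]  # negative vertical freqs
        return torch.fft.irfft2(low, s=u.shape[-2:])


class KroneckerAttention(nn.Module):
    """Axial attention as a stand-in for Kronecker-structured attention:
    attending along rows, then columns, factorizes the full (HW x HW)
    interaction, dropping the cost from O((HW)^2) to O(HW * (H + W))."""

    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.row_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.col_attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, H, W, C = x.shape                       # x: (B, H, W, C)
        rows = x.reshape(B * H, W, C)              # attend across each row
        rows, _ = self.row_attn(rows, rows, rows)
        x = rows.reshape(B, H, W, C)
        cols = x.permute(0, 2, 1, 3).reshape(B * W, H, C)  # across columns
        cols, _ = self.col_attn(cols, cols, cols)
        return cols.reshape(B, W, H, C).permute(0, 2, 1, 3)


class LocalGlobalMixing(nn.Module):
    """Multiplicative mixing: gate the fast-varying residual with a signal
    derived from the large-scale state. A pointwise product in physical
    space is a convolution of spectra, so it nonlinearly mixes frequencies
    and regenerates small-scale content slaved to the macroscopic field."""

    def __init__(self, channels: int):
        super().__init__()
        self.gate = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, high: torch.Tensor, low: torch.Tensor) -> torch.Tensor:
        return high * torch.sigmoid(self.gate(low))


# --- usage sketch: one scale-split forward pass ---
u = torch.randn(2, 8, 64, 64)                  # (batch, channels, H, W)
smooth = SpectralEmbedding(modes=12)(u)        # large-scale component
residual = u - smooth                          # fast-varying component
attn = KroneckerAttention(dim=8)
large = attn(smooth.permute(0, 2, 3, 1)).permute(0, 3, 1, 2)
out = large + LocalGlobalMixing(channels=8)(residual, large)
```

The design point the sketch illustrates is the division of labor: expensive attention is spent only on the truncated low-frequency component, while the high-frequency residual is recovered by a cheap multiplicative coupling to the large-scale state rather than by global attention.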