We introduce MORPH, a modality-agnostic, autoregressive foundation model for partial differential equations (PDEs). MORPH is built on a convolutional vision transformer backbone that seamlessly handles heterogeneous spatiotemporal datasets of varying data modality (1D--3D), different resolutions, and multiple fields with mixed scalar and vector components. The architecture combines (i) component-wise convolution, which jointly processes scalar and vector channels to capture local interactions, (ii) inter-field cross-attention, which models and selectively propagates information between different physical fields, and (iii) axial attention, which factorizes full spatiotemporal self-attention along individual spatial and temporal axes, reducing computational burden while retaining expressivity. We pretrain multiple model variants on a diverse collection of heterogeneous PDE datasets and evaluate transfer to a range of downstream prediction tasks. Using both full-model fine-tuning and parameter-efficient low-rank adapters, MORPH outperforms models trained from scratch. Across extensive evaluations, MORPH matches or surpasses strong baselines and recent state-of-the-art models. Collectively, these capabilities make MORPH a flexible and powerful backbone for learning from the heterogeneous and multimodal nature of scientific observations, charting a path toward scalable and data-efficient scientific machine learning. The source code, datasets, and models are publicly available at https://github.com/lanl/MORPH.
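To illustrate the factorization behind axial attention, the sketch below applies 1D self-attention separately along the time, height, and width axes of a (batch, T, H, W, channels) tensor with residual connections, instead of attending over all T·H·W tokens jointly, which drops the score-matrix cost from O((T·H·W)²) to O(T·H·W·(T+H+W)). This is a minimal single-head NumPy illustration, not MORPH's implementation: the random projection matrices stand in for learned parameters, and the per-axis layer ordering is an assumption.

```python
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax over the given axis.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def attend_along(x, axis, rng):
    """Single-head self-attention along one axis of x (channels last).

    Random Q/K/V projections stand in for learned weights (assumption)."""
    c = x.shape[-1]
    Wq, Wk, Wv = (rng.standard_normal((c, c)) / np.sqrt(c) for _ in range(3))
    xm = np.moveaxis(x, axis, -2)                        # (..., L_axis, C)
    q, k, v = xm @ Wq, xm @ Wk, xm @ Wv
    scores = q @ np.swapaxes(k, -1, -2) / np.sqrt(c)     # (..., L, L): only
    out = softmax(scores) @ v                            # L x L per axis
    return np.moveaxis(out, -2, axis)

def axial_attention(x, rng):
    """Factorized spatiotemporal attention on x of shape (B, T, H, W, C):
    attend along time (axis 1), then height (2), then width (3),
    each with a residual connection."""
    for axis in (1, 2, 3):
        x = x + attend_along(x, axis, rng)
    return x

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 4, 5, 6, 8))  # (B, T, H, W, C)
y = axial_attention(x, rng)               # same shape as x
```

Because each pass mixes information along a single axis, three stacked passes give every token an (indirect) path to every other token on the grid, which is how expressivity is retained despite never forming the full T·H·W attention matrix.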