We introduce multiple physics pretraining (MPP), an autoregressive, task-agnostic pretraining approach for physical surrogate modeling. MPP trains large surrogate models to predict the dynamics of multiple heterogeneous physical systems simultaneously by learning features that are broadly useful across diverse physical tasks. To learn effectively in this setting, we introduce a shared embedding and normalization strategy that projects the fields of multiple systems into a single shared embedding space. We validate the efficacy of our approach on both pretraining and downstream tasks over a broad fluid-mechanics-oriented benchmark. We show that a single MPP-pretrained transformer can match or outperform task-specific baselines on all pretraining sub-tasks without finetuning. For downstream tasks, we demonstrate that finetuning MPP-trained models yields more accurate predictions across multiple time-steps on new physics than training from scratch or finetuning pretrained video foundation models. We open-source our code and model weights trained at multiple scales for reproducibility and community experimentation.
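To make the idea of a shared embedding and normalization strategy more concrete, the following is a minimal NumPy sketch, not the released implementation. It assumes each physical field has one row in a shared embedding table, and that each sample is standardized per field before projection. The field vocabulary `FIELD_NAMES`, the helpers `normalize_fields` and `embed_fields`, and the per-sample standardization are illustrative assumptions rather than details stated in the abstract.

```python
import numpy as np

# Hypothetical global vocabulary of physical fields shared by all systems.
FIELD_NAMES = ["density", "pressure", "velocity_x", "velocity_y"]
FIELD_INDEX = {name: i for i, name in enumerate(FIELD_NAMES)}


def normalize_fields(state, eps=1e-6):
    """Standardize each field of one sample over its spatial axes.

    state: array of shape (n_fields, height, width).
    Returns the normalized state plus the statistics needed to undo it.
    """
    mean = state.mean(axis=(1, 2), keepdims=True)
    std = state.std(axis=(1, 2), keepdims=True)
    return (state - mean) / (std + eps), mean, std


def embed_fields(state, field_names, embedding_table):
    """Project a system's fields into the shared embedding space.

    embedding_table: (len(FIELD_NAMES), embed_dim), one row per known field.
    Only the rows for the fields present in this system are used, so systems
    with different field sets land in the same embed_dim-dimensional space.
    Returns an array of shape (embed_dim, height, width).
    """
    rows = embedding_table[[FIELD_INDEX[name] for name in field_names]]
    # Contract over the field axis: (f, d) x (f, h, w) -> (d, h, w).
    return np.einsum("fd,fhw->dhw", rows, state)


rng = np.random.default_rng(0)
embed_dim = 8
embedding_table = rng.normal(size=(len(FIELD_NAMES), embed_dim))

# Two toy systems with different field sets and very different scales.
system_a = 1e3 * rng.normal(size=(3, 16, 16))   # density, velocity_x, velocity_y
system_b = 1e-2 * rng.normal(size=(1, 16, 16))  # pressure only

norm_a, mean_a, std_a = normalize_fields(system_a)
norm_b, mean_b, std_b = normalize_fields(system_b)

tokens_a = embed_fields(norm_a, ["density", "velocity_x", "velocity_y"], embedding_table)
tokens_b = embed_fields(norm_b, ["pressure"], embedding_table)
```

In this sketch both systems end up with the same channel dimension (`embed_dim`) despite having different numbers of fields and very different magnitudes, which is what would allow a single backbone to consume them. The stored `mean` and `std` could be used to map predictions back to physical units.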