We introduce multiple physics pretraining (MPP), an autoregressive, task-agnostic pretraining approach for physical surrogate modeling of spatiotemporal systems with transformers. In MPP, rather than training one model on a specific physical system, we train a backbone model to predict the dynamics of multiple heterogeneous physical systems simultaneously, in order to learn features that are broadly useful across systems and facilitate transfer. To learn effectively in this setting, we develop a shared embedding and normalization strategy that projects the fields of multiple systems into a single shared embedding space. We validate the efficacy of our approach on both pretraining and downstream tasks over a broad fluid-mechanics-oriented benchmark. We show that a single MPP-pretrained transformer is able to match or outperform task-specific baselines on all pretraining sub-tasks without the need for finetuning. For downstream tasks, we demonstrate that finetuning MPP-trained models yields more accurate predictions across multiple time steps on systems with previously unseen physical components, or on higher-dimensional systems, compared to training from scratch or finetuning pretrained video foundation models. We open-source our code and model weights trained at multiple scales for reproducibility.
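The shared embedding and normalization step can be sketched roughly as follows. This is a minimal illustrative assumption, not the paper's implementation: field names, the per-field normalization, and the learned per-field embedding directions (`projections`) are all hypothetical stand-ins for the actual learned components.

```python
import numpy as np

def embed_fields(fields, projections, eps=1e-6):
    """Project heterogeneous physical fields into one shared embedding space.

    fields: dict mapping field name -> 2D array of shape (H, W)
    projections: dict mapping field name -> 1D array of shape (d,),
        a hypothetical learned per-field embedding direction.
    Returns an array of shape (H, W, d): the shared-space embedding.
    """
    h, w = next(iter(fields.values())).shape
    d = next(iter(projections.values())).shape[0]
    out = np.zeros((h, w, d))
    for name, field in fields.items():
        # Per-field normalization so systems with very different magnitudes
        # (e.g. velocity vs. pressure) land on comparable scales.
        mu, sigma = field.mean(), field.std()
        normed = (field - mu) / (sigma + eps)
        # Embed each normalized scalar field along its own learned direction,
        # summing contributions in the shared d-dimensional space.
        out += normed[..., None] * projections[name][None, None, :]
    return out
```

A backbone transformer would then operate on these shared-space embeddings regardless of which physical system produced the input fields; missing fields simply contribute nothing to the sum.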