Pre-training has been explored as a way to improve the efficiency and performance of training neural operators in data-scarce settings. However, it remains in its infancy because partial differential equation (PDE) data are inherently complex and diverse, with long trajectories, multiple scales, and varying dimensions. In this paper, we present a new auto-regressive denoising pre-training strategy that allows for more stable and efficient pre-training on PDE data and generalizes to various downstream tasks. Moreover, we design a flexible and scalable model architecture based on Fourier attention, which makes it easy to scale up the model for large-scale pre-training. We train our PDE foundation model with up to 0.5B parameters on 10+ PDE datasets with more than 100k trajectories. Extensive experiments show that our model achieves state-of-the-art (SOTA) results on these benchmarks and generalizes well, significantly improving performance on diverse downstream PDE tasks such as 3D data. Code is available at \url{https://github.com/thu-ml/DPOT}.
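The abstract names an auto-regressive denoising objective without spelling out the procedure. As a rough illustration only, one way such training pairs could be built is sketched below, assuming that Gaussian noise is added to a window of past frames while the next frame is kept clean as the prediction target. The function name, the `window` and `noise_scale` parameters, and the toy trajectory are hypothetical and are not taken from the DPOT code base.

```python
import random


def make_denoising_pairs(trajectory, window, noise_scale, seed=0):
    """Build (noisy history, clean next frame) pairs from one trajectory.

    trajectory:  list of frames, each frame a list of floats.
    window:      number of past frames used as input (hypothetical name).
    noise_scale: standard deviation of the Gaussian noise added to the
                 input frames only (an assumption of this sketch).
    """
    rng = random.Random(seed)
    pairs = []
    for t in range(window, len(trajectory)):
        history = trajectory[t - window:t]
        # Perturb the inputs so the model must be robust to its own
        # accumulated error during auto-regressive rollout.
        noisy = [[v + rng.gauss(0.0, noise_scale) for v in frame]
                 for frame in history]
        # The target frame is left clean.
        target = list(trajectory[t])
        pairs.append((noisy, target))
    return pairs


# Toy trajectory: 6 time steps, 4 grid values per step.
trajectory = [[float(t + i) for i in range(4)] for t in range(6)]
pairs = make_denoising_pairs(trajectory, window=3, noise_scale=0.01)
```

Each pair would then be fed to a next-step predictor; at inference time the model's own output is appended to the history and the window slides forward.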