Pre-training has been investigated as a way to improve the efficiency and performance of training neural operators in data-scarce settings. However, it remains largely in its infancy because partial differential equation (PDE) data are inherently complex and diverse, with long trajectories, multiple scales, and varying dimensions. In this paper, we present a new auto-regressive denoising pre-training strategy, which allows for more stable and efficient pre-training on PDE data and generalizes to various downstream tasks. Moreover, we design a flexible and scalable model architecture based on Fourier attention, which makes it easy to scale up the model for large-scale pre-training. We train our PDE foundation model with up to 0.5B parameters on 10+ PDE datasets containing more than 100k trajectories. Extensive experiments show that our model achieves state-of-the-art (SOTA) results on these benchmarks and generalizes well, significantly improving performance on diverse downstream PDE tasks, including 3D data. Code is available at \url{https://github.com/thu-ml/DPOT}.
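As a rough illustration of what an auto-regressive denoising objective can look like, the sketch below builds training pairs from a single trajectory: each input is a window of past frames with noise injected, and each target is the clean next frame. This is a minimal sketch under stated assumptions, not the implementation in the released code. The function names `make_denoising_pairs` and `relative_l2`, the Gaussian noise model, the window length, the noise scale, the toy sine-wave trajectory, and the persistence "model" are all hypothetical choices made here for illustration.

```python
import numpy as np


def make_denoising_pairs(trajectory, history, noise_scale, rng):
    """Return (noisy_inputs, clean_targets) for next-frame prediction.

    trajectory has shape (T, ...): T time frames of a PDE solution.
    Each input is a window of `history` consecutive frames with Gaussian
    noise added (an assumed noise model); each target is the clean frame
    that immediately follows the window.
    """
    inputs, targets = [], []
    for t in range(history, trajectory.shape[0]):
        window = trajectory[t - history:t]
        noise = noise_scale * rng.standard_normal(window.shape)
        inputs.append(window + noise)
        targets.append(trajectory[t])
    return np.stack(inputs), np.stack(targets)


def relative_l2(pred, target):
    """Relative L2 error, averaged over samples."""
    axes = tuple(range(1, target.ndim))
    num = np.sqrt(np.sum((pred - target) ** 2, axis=axes))
    den = np.sqrt(np.sum(target ** 2, axis=axes))
    return float(np.mean(num / den))


rng = np.random.default_rng(0)

# Toy trajectory (hypothetical data): a travelling sine wave on 16 grid
# points over 12 time steps.
x = np.linspace(0.0, 2.0 * np.pi, 16, endpoint=False)
trajectory = np.stack([np.sin(x - 0.1 * k) for k in range(12)])

inputs, targets = make_denoising_pairs(
    trajectory, history=4, noise_scale=1e-3, rng=rng
)

# Stand-in for a neural operator: persistence, i.e. predict that the next
# frame equals the last (noisy) frame of the window.
pred = inputs[:, -1]
loss = relative_l2(pred, targets)
```

In an actual pre-training loop the persistence stand-in would be replaced by a trainable model, and `loss` would be minimized over windows drawn from many trajectories and datasets. Because the target is always the clean frame, the model is pushed to be robust to perturbed inputs of the kind it encounters when its own predictions are fed back during rollout.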