Pre-training has been investigated to improve the efficiency and performance of training neural operators in data-scarce settings. However, it remains largely in its infancy due to the inherent complexity and diversity of partial differential equation (PDE) data, such as long trajectories, multiple scales, and varying dimensions. In this paper, we present a new auto-regressive denoising pre-training strategy that enables more stable and efficient pre-training on PDE data and generalizes to various downstream tasks. Moreover, by designing a flexible and scalable model architecture based on Fourier attention, we can easily scale up the model for large-scale pre-training. We train our PDE foundation model with up to 0.5B parameters on 10+ PDE datasets containing more than 100k trajectories. Extensive experiments show that our model achieves state-of-the-art results on these benchmarks and exhibits strong generalizability, significantly enhancing performance on diverse downstream PDE tasks such as those involving 3D data. Code is available at \url{https://github.com/thu-ml/DPOT}.
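The abstract does not spell out the training objective, but the core idea of auto-regressive denoising pre-training can be sketched as follows: corrupt the context frames of a trajectory with noise and regress the clean next frame, so the model learns rollouts that are robust to its own prediction errors. In this minimal PyTorch sketch, \texttt{model}, \texttt{noise\_std}, and the \texttt{(B, T, H, W, C)} trajectory layout are illustrative assumptions, not the exact DPOT implementation.

\begin{verbatim}
import torch
import torch.nn.functional as F

def denoising_autoregressive_step(model, trajectory, noise_std=1e-3):
    """One hypothetical pre-training step on a batch of PDE trajectories.

    trajectory: (B, T, H, W, C) tensor of T consecutive solution frames.
    """
    inputs, target = trajectory[:, :-1], trajectory[:, -1]  # context / next frame
    noisy_inputs = inputs + noise_std * torch.randn_like(inputs)  # denoising corruption
    pred = model(noisy_inputs)                                    # predict clean next frame
    return F.mse_loss(pred, target)
\end{verbatim}

At inference time the same model is applied auto-regressively, feeding each prediction back in as the newest context frame; the noise injected during training hedges against the distribution shift this rollout induces.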
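Likewise, the Fourier attention backbone is only named here, but a representative Fourier-domain mixing layer (in the spirit of FNO/AFNO-style spectral blocks) conveys the flavor: transform features to frequency space, apply a learned map on the lowest retained modes, and transform back. The mode truncation and weight layout below are illustrative assumptions, not the exact DPOT block.

\begin{verbatim}
import torch
import torch.nn as nn

class FourierMixer(nn.Module):
    """Illustrative spectral mixing layer: learned per-mode channel mixing
    on the lowest `modes` frequencies of a 2D feature map."""

    def __init__(self, channels, modes):
        super().__init__()
        self.modes = modes
        scale = 1.0 / channels
        self.weight = nn.Parameter(
            scale * torch.randn(modes, modes, channels, channels,
                                dtype=torch.cfloat)
        )

    def forward(self, x):  # x: (B, H, W, C) real-valued features
        B, H, W, C = x.shape
        xf = torch.fft.rfft2(x, dim=(1, 2))   # (B, H, W//2+1, C) complex
        out = torch.zeros_like(xf)
        m = self.modes
        # mix channels only on the retained low-frequency modes
        out[:, :m, :m] = torch.einsum(
            "bxyc,xycd->bxyd", xf[:, :m, :m], self.weight
        )
        return torch.fft.irfft2(out, s=(H, W), dim=(1, 2))
\end{verbatim}

Because the learned weights act on frequency modes rather than on a fixed grid, such a layer is resolution-flexible, which is one plausible reason a Fourier-based architecture scales across PDE datasets with varying discretizations.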