RealPDEBench: A Benchmark for Complex Physical Systems with Real-World Data

Peiyan Hu, Haodong Feng, Hongyuan Liu, Tongtong Yan, Wenhao Deng, Tianrun Gao, Rong Zheng, Haoren Zheng, Chenglei Yu, Chuanrui Wang, Kaiwen Li, Zhi-Ming Ma, Dezhi Zhou, Xingcai Lu, Dixia Fan, Tailin Wu

From arXiv; ICLR 2026 oral; 46 pages, 21 figures

Predicting the evolution of complex physical systems remains a central problem in science and engineering. Despite rapid progress in scientific machine learning (ML) models, a critical bottleneck is the scarcity of real-world data, which is expensive to collect; as a result, most current models are trained and validated on simulated data. Beyond limiting the development and evaluation of scientific ML, this gap also hinders research into essential tasks such as sim-to-real transfer. We introduce RealPDEBench, the first benchmark for scientific ML that integrates real-world measurements with paired numerical simulations. RealPDEBench consists of five datasets, three tasks, eight metrics, and ten baselines. We first present five real-world measured datasets with paired simulated datasets across different complex physical systems. We further define three tasks, which allow comparisons between real-world and simulated data and facilitate the development of methods to bridge the two. Moreover, we design eight evaluation metrics, spanning data-oriented and physics-oriented measures, and benchmark ten representative baselines, including state-of-the-art models, pretrained PDE foundation models, and a traditional method. Experiments reveal significant discrepancies between simulated and real-world data, while showing that pretraining with simulated data consistently improves both accuracy and convergence. With this work, we hope to provide insights from real-world data, advancing scientific ML toward bridging the sim-to-real gap and real-world deployment. Our benchmark, datasets, and instructions are available at https://realpdebench.github.io/.
