Solving partial differential equations (PDEs) is a fundamental problem in science and engineering. Although neural PDE solvers can be more efficient than established numerical solvers, they often require large amounts of training data that are costly to obtain. Active learning (AL) could help surrogate models reach the same accuracy with smaller training sets by querying classical solvers with more informative initial conditions and PDE parameters. Although AL is common in other domains, it has not yet been studied extensively for neural PDE solvers. To bridge this gap, we introduce AL4PDE, a modular and extensible active learning benchmark. It provides multiple parametric PDEs and state-of-the-art surrogate models for the solver-in-the-loop setting, enabling the evaluation of existing AL methods and the development of new ones for PDE solving. We use the benchmark to evaluate batch active learning algorithms, including uncertainty-based and feature-based methods. We show that AL reduces the average error by up to 71% compared to random sampling and significantly reduces worst-case errors. Moreover, AL generates similar datasets across repeated runs, with consistent distributions over the PDE parameters and initial conditions. The acquired datasets are reusable, providing benefits for surrogate models that were not involved in the data generation.
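
To make the solver-in-the-loop setting concrete, the following is a minimal sketch of pool-based batch active learning with an uncertainty-based selection rule. It is not the AL4PDE interface, which the abstract does not describe. Everything in it is an illustrative assumption: `toy_solver` is a hypothetical stand-in for a classical numerical solver, the surrogate is a small ensemble of random-feature ridge regressors rather than a neural PDE solver, and selection simply takes the pool points with the largest ensemble disagreement.

```python
import numpy as np


def toy_solver(params):
    """Hypothetical stand-in for an expensive classical solver (assumption)."""
    return np.sin(3.0 * params) + 0.5 * params ** 2


def fit_member(x, y, rng, n_features=32, ridge=1e-3):
    """Fit one ensemble member: ridge regression on random cosine features."""
    w = rng.normal(scale=2.0, size=n_features)
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    phi = np.cos(np.outer(x, w) + b)
    gram = phi.T @ phi + ridge * np.eye(n_features)
    coef = np.linalg.solve(gram, phi.T @ y)
    return w, b, coef


def predict(member, x):
    """Evaluate one ensemble member on the inputs x."""
    w, b, coef = member
    return np.cos(np.outer(x, w) + b) @ coef


def run_active_learning(n_rounds=5, batch_size=4, n_members=5, seed=0):
    """Grow a labeled set by querying the solver where the ensemble disagrees."""
    rng = np.random.default_rng(seed)
    # Candidate inputs; in the PDE setting these would be initial
    # conditions and PDE parameters rather than scalars.
    pool = rng.uniform(-2.0, 2.0, size=200)
    labeled_idx = rng.choice(len(pool), size=batch_size, replace=False).tolist()

    for _ in range(n_rounds):
        x = pool[labeled_idx]
        y = toy_solver(x)  # labels come from the solver in the loop
        ensemble = [fit_member(x, y, rng) for _ in range(n_members)]
        preds = np.stack([predict(m, pool) for m in ensemble])
        uncertainty = preds.std(axis=0)  # ensemble disagreement per pool point
        uncertainty[labeled_idx] = -np.inf  # never re-select labeled points
        new_idx = np.argsort(uncertainty)[-batch_size:]
        labeled_idx.extend(new_idx.tolist())

    return pool, np.array(labeled_idx)


if __name__ == "__main__":
    pool, idx = run_active_learning()
    print(f"labeled {len(idx)} of {len(pool)} pool points")
```

Taking the top-k most uncertain points is the simplest batch rule and can select near-duplicate inputs. Feature-based batch methods, which the abstract also mentions, are one way to encourage diversity within a batch.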