The Fourier neural operator (FNO) is a powerful technique for learning surrogate maps for the solution operators of partial differential equations (PDEs). In many real-world applications, which often require high-resolution data, training time and memory usage are significant bottlenecks. Mixed-precision training techniques exist for standard neural networks, but they are designed for real-valued data types in finite dimensions and therefore cannot be applied directly to FNO, which crucially operates in the (complex-valued) Fourier domain and in function spaces. On the other hand, since the Fourier transform is already an approximation (due to discretization error), we do not need to perform the operation at full precision. In this work, we (i) profile memory and runtime for FNO with full and mixed-precision training, (ii) conduct a study on the numerical stability of mixed-precision training of FNO, and (iii) devise a training routine that substantially decreases training time and memory usage (by up to 34%), with little or no reduction in accuracy, on the Navier-Stokes and Darcy flow equations. Combined with the recently proposed tensorized FNO (Kossaifi et al., 2023), the resulting model is far more accurate and also significantly faster than the original FNO.
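The claim that the Fourier transform is already approximate can be illustrated with a small numerical experiment. The sketch below is not the training routine of this work; it is a minimal, standard-library-only illustration under several assumptions: the test function `f`, the grid sizes `n_coarse` and `n_fine`, the number of retained modes, and the helper names are all hypothetical choices. Half precision is only simulated by rounding stored values (inputs and output coefficients) to IEEE half precision, while the sums themselves are still accumulated in double precision.

```python
import cmath
import math
import struct


def to_half(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def fourier_modes(samples, modes):
    """First `modes` coefficients of the 1/n-normalized DFT (naive direct sum)."""
    n = len(samples)
    return [
        sum(s * cmath.exp(-2j * cmath.pi * k * j / n) for j, s in enumerate(samples)) / n
        for k in range(modes)
    ]


def f(x: float) -> float:
    """Illustrative function on [0, 1): continuous when extended periodically, with a kink."""
    return abs(math.sin(math.pi * x))


def max_abs_diff(a, b):
    """Largest absolute difference between two coefficient lists."""
    return max(abs(u - v) for u, v in zip(a, b))


# Assumed (hypothetical) grid sizes and number of retained Fourier modes.
n_coarse, n_fine, modes = 32, 1024, 8

# Double-precision coefficients on the coarse grid and on a fine reference grid.
coarse = fourier_modes([f(j / n_coarse) for j in range(n_coarse)], modes)
fine = fourier_modes([f(j / n_fine) for j in range(n_fine)], modes)

# Simulated half precision: round the inputs, transform, then round the outputs.
half_inputs = [to_half(f(j / n_coarse)) for j in range(n_coarse)]
half = [
    complex(to_half(c.real), to_half(c.imag))
    for c in fourier_modes(half_inputs, modes)
]

# Error from sampling on a coarse grid vs. error from reduced-precision storage.
discretization_error = max_abs_diff(coarse, fine)
precision_error = max_abs_diff(half, coarse)

print(f"discretization error: {discretization_error:.2e}")
print(f"half-precision error: {precision_error:.2e}")
```

Printing both quantities lets the two error sources be compared directly for this particular function and grid. Because only storage is rounded here, the experiment says nothing about overflow or the accumulation error of true half-precision arithmetic, which is what a stability study of mixed-precision training has to address.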