Neural operators, such as Fourier Neural Operators (FNO), form a principled approach for learning solution operators for PDEs and other mappings between function spaces. However, many real-world problems require high-resolution training data, and long training times and limited GPU memory pose significant barriers. One solution is to train neural operators in mixed precision, which reduces the memory requirement and increases training speed. However, existing mixed-precision training techniques are designed for standard neural networks, and we find that applying them directly to FNO leads to numerical overflow and poor memory efficiency. Further, at first glance, it may appear that mixed precision in FNO would cause drastic accuracy degradation, since reducing the precision of the Fourier transform yields poor results in classical numerical solvers. We show that this is not the case; in fact, we prove that reducing the precision in FNO still guarantees a good approximation bound when done in a targeted manner. Specifically, we build on the intuition that neural operator learning inherently incurs an approximation error from discretizing the infinite-dimensional ground-truth input function, which implies that training in full precision is not needed. We formalize this intuition by rigorously characterizing the approximation and precision errors of FNO and bounding these errors for general input functions. We prove that the precision error is asymptotically comparable to the approximation error. Based on this, we design a simple method to optimize the memory-intensive half-precision tensor contractions by greedily finding the optimal contraction order. Through extensive experiments on different state-of-the-art neural operators, datasets, and GPUs, we demonstrate that our approach reduces GPU memory usage by up to 50% and improves throughput by 58% with little or no reduction in accuracy.
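To make the idea of greedily choosing a contraction order concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical CP-factorized spectral weight (factors `ir`, `or`, `xr`, `yr`) applied to an input with indices `bixy`, and it assumes a particular greedy score: the growth in element count caused by each pairwise contraction. The function name `greedy_contraction_order`, the index labels, and the sizes are all illustrative.

```python
from itertools import combinations
from math import prod


def greedy_contraction_order(operands, output, sizes):
    """Pick pairwise contractions one at a time, always taking the pair
    whose result adds the least memory (assumed heuristic, in elements).

    operands: list of index strings, e.g. ["bixy", "ir"]
    output:   index string of the final result, e.g. "boxy"
    sizes:    dict mapping each index letter to its dimension
    Returns (steps, peak), where steps is a list of
    (left_indices, right_indices, result_indices) and peak is the
    largest intermediate (or final) tensor created, in elements.
    """
    tensors = list(operands)
    steps, peak = [], 0

    def numel(indices):
        return prod(sizes[c] for c in indices)

    while len(tensors) > 1:
        best = None
        for i, j in combinations(range(len(tensors)), 2):
            # An index must survive if the output or any other
            # remaining tensor still uses it.
            needed = set(output)
            for k, t in enumerate(tensors):
                if k != i and k != j:
                    needed.update(t)
            merged = dict.fromkeys(tensors[i] + tensors[j])  # unique, ordered
            result = "".join(c for c in merged if c in needed)
            # Greedy score: memory growth caused by this contraction.
            score = numel(result) - numel(tensors[i]) - numel(tensors[j])
            if best is None or score < best[0]:
                best = (score, i, j, result)
        _, i, j, result = best
        steps.append((tensors[i], tensors[j], result))
        peak = max(peak, numel(result))
        tensors = [t for k, t in enumerate(tensors) if k not in (i, j)] + [result]
    return steps, peak


# Hypothetical sizes: batch, in/out channels, Fourier modes, CP rank.
sizes = {"b": 8, "i": 32, "o": 32, "x": 16, "y": 16, "r": 4}
steps, peak = greedy_contraction_order(
    ["bixy", "ir", "or", "xr", "yr"], "boxy", sizes
)
for left, right, result in steps:
    print(f"{left} , {right} -> {result}")
print("peak elements:", peak)
```

With these assumed sizes, the sketch contracts the input with one factor at a time, so the largest tensor it creates is the output itself (65,536 elements). Rebuilding the dense weight `ioxy` first would instead materialize 262,144 elements. The index order inside each intermediate result is arbitrary; only the set of indices matters for the memory estimate. In half precision, each element occupies 2 bytes.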