Operator learning for Partial Differential Equations (PDEs) is rapidly emerging as a promising approach for surrogate modeling of intricate systems. Transformers with the self-attention mechanism, a powerful tool originally designed for natural language processing, have recently been adapted for operator learning. However, they confront challenges, including high computational demands and limited interpretability. This raises a critical question: is there a more efficient attention mechanism for Transformer-based operator learning? This paper proposes the Position-induced Transformer (PiT), built on an innovative position-attention mechanism, which demonstrates significant advantages over classical self-attention in operator learning. Position-attention draws inspiration from numerical methods for PDEs. Unlike self-attention, position-attention is induced solely by the spatial interrelations of the sampling positions for the operator's input functions and does not rely on the input function values themselves, thereby greatly boosting efficiency. PiT exhibits superior performance over current state-of-the-art neural operators in a variety of complex operator learning tasks across diverse PDE benchmarks. Additionally, PiT possesses an enhanced discretization-convergence feature compared with the widely used Fourier neural operator.
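To make the contrast with self-attention concrete, the sketch below illustrates one plausible instantiation of a position-induced attention step: the attention weights are computed only from pairwise distances between sampling positions, not from the input function values, so they can be precomputed once per mesh and reused across input functions. The softmax-over-negative-squared-distance form, the `scale` parameter, and the function names are illustrative assumptions, not the paper's exact formulation.

```python
import torch

def position_attention(values, positions, scale=0.05):
    """Conceptual sketch of a position-induced attention layer (assumed form).

    values:    (n, d) sampled input-function values at n mesh points
    positions: (n, p) spatial coordinates of the n sampling positions
    scale:     hypothetical temperature controlling how local the weights are

    The weights depend only on the geometry of the sampling positions,
    never on `values`, unlike query-key self-attention.
    """
    dist2 = torch.cdist(positions, positions) ** 2   # (n, n) squared pairwise distances
    attn = torch.softmax(-dist2 / scale, dim=-1)     # weights induced by positions only
    return attn @ values                              # aggregate the function values

# Usage: 256 sampling points in 2D, a scalar-valued input function
pos = torch.rand(256, 2)
val = torch.sin(pos.sum(dim=-1, keepdim=True))
out = position_attention(val, pos)
print(out.shape)  # torch.Size([256, 1])
```

Because `attn` is a fixed function of the mesh, changing the discretization changes only the precomputed weight matrix, which is one way to read the abstract's claim about efficiency and discretization convergence.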