Transformer-based neural operators have shown remarkable performance for approximating solution operators of partial differential equations on complex geometries. However, existing approaches implicitly assume a fixed domain size, which limits their ability to generalize to other domain sizes at inference. In this work, we investigate domain extension, namely zero-shot inference on spatial domains that are significantly larger than those encountered during training. We argue that this setting fundamentally requires spatial locality and translation equivariance. We propose to implement this locality via a decomposable bias in the computation of the attention logits, which gives finely controllable locality while remaining fully decomposable into query-key inner products and therefore directly compatible with optimized attention kernels. Combined with rotary positional embeddings, this bias yields expressive embeddings with controllable spatial support without altering the transformer architecture. We empirically show that our approach substantially improves zero-shot generalization to larger domains across two PDE benchmarks and a 3D industrial atmospheric flow application. Our code and datasets are available at https://github.com/cerea-daml/domain-extension.
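To illustrate what a bias that is "decomposable into query-key inner products" can look like, the following is a minimal sketch under an assumed form of the bias, namely a squared-distance penalty -alpha * ||x_i - x_j||^2 between token positions. This specific form, the function name `augment_for_locality`, and the parameter `alpha` are illustrative assumptions and are not taken from the paper. The sketch relies on the identity ||x_i - x_j||^2 = ||x_i||^2 + ||x_j||^2 - 2 x_i . x_j, which lets the bias be absorbed into extra query and key features, so the biased logits are again plain inner products.

```python
import numpy as np


def augment_for_locality(q, k, x_q, x_k, alpha):
    """Fold the assumed bias -alpha * ||x_i - x_j||^2 into query/key features.

    q, k     : (n_q, d) and (n_k, d) query and key features
    x_q, x_k : (n_q, p) and (n_k, p) spatial positions of the tokens
    alpha    : locality strength (hypothetical parameter; larger = more local)
    """
    sq_q = np.sum(x_q**2, axis=-1, keepdims=True)  # ||x_i||^2, shape (n_q, 1)
    sq_k = np.sum(x_k**2, axis=-1, keepdims=True)  # ||x_j||^2, shape (n_k, 1)
    # q_aug . k_aug = q.k + 2*alpha*x_i.x_j - alpha*||x_i||^2 - alpha*||x_j||^2
    q_aug = np.concatenate(
        [q, 2.0 * alpha * x_q, -alpha * sq_q, np.ones_like(sq_q)], axis=-1
    )
    k_aug = np.concatenate(
        [k, x_k, np.ones_like(sq_k), -alpha * sq_k], axis=-1
    )
    return q_aug, k_aug


# Small random example: 6 tokens, 4 feature channels, 2D positions.
rng = np.random.default_rng(0)
n, d, p = 6, 4, 2
q = rng.normal(size=(n, d))
k = rng.normal(size=(n, d))
x = rng.uniform(size=(n, p))
alpha = 3.0

# Reference logits with the bias added explicitly as an (n, n) matrix.
dist2 = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)
logits_ref = q @ k.T - alpha * dist2

# Same logits obtained purely from inner products of augmented vectors.
q_aug, k_aug = augment_for_locality(q, k, x, x, alpha)
logits_aug = q_aug @ k_aug.T

# Shifting every position by the same offset leaves the logits unchanged,
# because the bias depends only on position differences.
q_shift, k_shift = augment_for_locality(q, k, x + 10.0, x + 10.0, alpha)
logits_shift = q_shift @ k_shift.T
```

Because the augmented logits are ordinary inner products, no explicit bias matrix needs to be materialized, which is what makes this kind of construction usable with fused attention kernels. Note that a kernel which multiplies the logits by a scale such as 1/sqrt(d) would rescale the bias as well, so `alpha` would have to be adjusted accordingly; the sketch omits that scale for clarity.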