Boundary-condition (BC) handling is a major source of complexity in PDE solvers on structured and block-structured grids, especially for high-order methods and distributed-memory execution. We present Mat2Boundary, a DSL and compiler for boundary computations that models a broad class of boundary conditions as affine sparse linear operators. This abstraction unifies halo copying, circular and symmetric mappings, zero padding, block-edge synchronization, and user-defined interpolation, while exposing a modular basic sub-matrix interface for declarative composition. To make this representation efficient, Mat2Boundary combines multi-stage programming and polyhedral analysis to generate matrix-free kernels for structured cases, support user-defined sparse matrices for irregular cases, eliminate redundant boundary work, and synthesize reusable communication schedules for distributed execution. In evaluations on two shallow-water equation solvers on cubed-sphere grids and on HPCG, Mat2Boundary achieves up to 7.6$\times$ speedup on BC kernels, reduces BC code by over 70%, and scales to 1,344 CPU cores with 72%-88% efficiency.
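To make the operator view concrete, the following minimal Python sketch treats the ghost cells of a 1D grid as the result of applying a sparse affine map to the interior values. It is only an illustration under stated assumptions: the function names, the triplet representation, and the ghost-cell layout are hypothetical and are not Mat2Boundary's DSL or API. It shows that circular (periodic), symmetric, and zero-padding conditions differ only in their sparse entries, and that a structured case such as the periodic one collapses to a matrix-free copy.

```python
# Illustrative sketch only; all names are hypothetical, not Mat2Boundary's API.
# A 1D grid has n interior cells and g ghost cells per side. Ghost rows
# 0..g-1 are the left ghosts (outermost first); rows g..2g-1 are the right
# ghosts (innermost first). Each entry is a (ghost_row, interior_col, weight).

def periodic_entries(n, g):
    """Circular mapping: each ghost copies the cell from the opposite side."""
    left = [(k, n - g + k, 1.0) for k in range(g)]
    right = [(g + k, k, 1.0) for k in range(g)]
    return left + right

def symmetric_entries(n, g):
    """Symmetric mapping: each ghost mirrors the interior about the boundary face."""
    left = [(k, g - 1 - k, 1.0) for k in range(g)]
    right = [(g + k, n - 1 - k, 1.0) for k in range(g)]
    return left + right

def zero_entries(n, g):
    """Zero padding: the operator has no nonzero entries."""
    return []

def apply_bc(entries, interior, g, offset=None):
    """Compute ghosts = B @ interior + offset and return the padded array."""
    ghosts = list(offset) if offset is not None else [0.0] * (2 * g)
    for row, col, weight in entries:
        ghosts[row] += weight * interior[col]
    return ghosts[:g] + list(interior) + ghosts[g:]

def periodic_matrix_free(interior, g):
    """Same result as the periodic operator, without storing any matrix."""
    return list(interior[-g:]) + list(interior) + list(interior[:g])

if __name__ == "__main__":
    u = [1.0, 2.0, 3.0, 4.0, 5.0]
    print(apply_bc(periodic_entries(len(u), 2), u, 2))
    print(apply_bc(symmetric_entries(len(u), 2), u, 2))
    print(apply_bc(zero_entries(len(u), 2), u, 2))
```

The optional `offset` argument is what makes the map affine rather than purely linear: an empty operator combined with a constant offset fills the ghost cells with a prescribed value.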