We present a fast sparse matrix permutation algorithm tailored to linear systems arising from triangle meshes. Our approach produces nested-dissection-style permutations while significantly reducing the runtime overhead of computing them. Rather than enforcing strict balance and separator optimality, the algorithm deliberately relaxes these requirements in favor of fast partitioning and efficient elimination-tree construction. Our method decomposes the permutation into patch-level local orderings and a compact quotient-graph ordering of separators, preserving the essential structure required by sparse Cholesky factorization while avoiding its most expensive components. We integrate our algorithm into vendor-maintained sparse Cholesky solvers on both CPUs and GPUs. Across a range of graphics applications, including single and repeated factorizations, our method reduces permutation time and improves sparse Cholesky solve performance by up to 6.27x. Our code is available at https://github.com/BehroozZare/fast-permute.
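To make the idea of a patch-plus-separator ordering concrete, the following is a minimal Python sketch of a single-level nested-dissection-style permutation. It is an illustration under stated assumptions, not the implementation in the linked repository: the function name `patch_separator_ordering`, the adjacency-list input, and the rule for choosing separator vertices (for each edge that crosses two patches, the endpoint in the higher-numbered patch) are all hypothetical. The local ordering inside each patch is simply vertex-index order, and separators are appended last without any quotient-graph ordering.

```python
from collections import defaultdict


def patch_separator_ordering(adj, patch):
    """Return (perm, is_sep) for a symmetric graph.

    adj   : list of neighbor lists, adj[u] holds the neighbors of vertex u
    patch : list of patch ids, patch[u] is the patch that vertex u belongs to

    Hypothetical separator rule (an assumption for this sketch): whenever an
    edge joins two different patches, the endpoint in the higher-numbered
    patch becomes a separator vertex. After this step, no two interior
    vertices from different patches are adjacent.
    """
    n = len(adj)
    is_sep = [False] * n
    for u in range(n):
        for v in adj[u]:
            if patch[u] > patch[v]:
                is_sep[u] = True

    # Group interior (non-separator) vertices by patch.
    interiors = defaultdict(list)
    separators = []
    for u in range(n):
        if is_sep[u]:
            separators.append(u)
        else:
            interiors[patch[u]].append(u)

    # Patch interiors first (one contiguous block per patch), separators last,
    # so that separator rows and columns are eliminated after the patches.
    perm = []
    for p in sorted(interiors):
        perm.extend(interiors[p])
    perm.extend(separators)
    return perm, is_sep
```

For a six-vertex path 0-1-2-3-4-5 split into patches {0, 1, 2} and {3, 4, 5}, this sketch marks vertex 3 as the separator and returns the order [0, 1, 2, 4, 5, 3]. Because the two patch interiors are decoupled once the separator is removed, each block could be ordered independently, which is the structural property a nested-dissection-style permutation relies on.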