The approximate minimum degree algorithm is widely used before numerical factorization to reduce fill-in for sparse matrices. While considerable attention has been given to the numerical factorization process, less focus has been placed on parallelizing the approximate minimum degree algorithm itself. In this paper, we explore different parallelization strategies and introduce a novel parallel framework that leverages multiple elimination on distance-2 independent sets. Our evaluation shows that parallelism within individual elimination steps is limited by low computational workload and significant memory contention. In contrast, our proposed framework overcomes these challenges by parallelizing the work across elimination steps. To the best of our knowledge, our implementation is the first scalable shared-memory implementation of the approximate minimum degree algorithm. Experimental results show a speedup of up to 7.29x on 64 threads over the state-of-the-art sequential implementation in SuiteSparse.
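To make the notion of a distance-2 independent set concrete, the following is a minimal sketch, not the paper's implementation. It assumes a plain adjacency-set graph rather than the quotient graph and approximate degrees that an actual approximate minimum degree code maintains, and the function name `distance2_independent_set` is hypothetical. The idea illustrated is that two pivots whose closed neighborhoods are disjoint are neither adjacent nor share a neighbor, so eliminating them should touch disjoint sets of vertices.

```python
def distance2_independent_set(adj, candidates):
    """Greedily pick vertices from `candidates` so that no two picked
    vertices are adjacent or share a common neighbor.

    adj: dict mapping each vertex to the set of its neighbors (assumed
         symmetric and without self-loops).
    candidates: iterable of vertices in priority order, for example
         sorted by increasing degree.
    """
    blocked = set()  # union of closed neighborhoods of the chosen pivots
    chosen = []
    for v in candidates:
        closed = {v} | adj[v]  # v together with its neighbors
        if closed & blocked:
            # v is within distance 2 of an already chosen pivot
            continue
        chosen.append(v)
        blocked |= closed
    return chosen


# Path graph 0-1-2-3-4-5, candidates ordered by (degree, vertex id).
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
order = sorted(path, key=lambda v: (len(path[v]), v))
pivots = distance2_independent_set(path, order)
```

In this sketch each pivot in `pivots` could be handed to a separate thread, since the vertices each elimination would update do not overlap. How candidates are selected and how degrees are updated afterwards are left out on purpose.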