Efficient solution of large-scale, ill-conditioned, and indefinite systems of algebraic equations is needed in many computational fields, including multiphysics simulations, machine learning, and data science. Because of their robustness and accuracy, direct solvers are crucial components of a scalable solver toolchain. In this chapter, we review recent advances in sparse direct solvers along two axes: 1) reducing communication and latency costs in both task- and data-parallel settings, and 2) reducing computational complexity via low-rank and other compression techniques such as hierarchical matrix algebra. In addition to algorithmic principles, we illustrate the key parallelization challenges and best practices for delivering high speed and reliability on modern heterogeneous parallel machines.