Linear solvers are major computational bottlenecks in a wide range of decision support and optimization computations. The challenge is even more pronounced on heterogeneous hardware, where traditional sparse numerical linear algebra methods are often inefficient. For example, methods for solving ill-conditioned linear systems have relied on conditional branching, which degrades performance on hardware accelerators such as graphics processing units (GPUs). To solve ill-conditioned systems more efficiently, our computational strategy separates computations that run efficiently on GPUs from those that need to run on traditional central processing units (CPUs), and it maximizes the reuse of expensive CPU computations. Iterative methods, which thus far have not been broadly used for ill-conditioned linear systems, play an important role in our approach. In particular, we extend ideas from [1] to implement iterative refinement using inexact LU factors and the flexible generalized minimal residual method (FGMRES), with the aim of efficient performance on GPUs. We focus on solutions that are effective within broader application contexts, and we discuss how early performance tests could be improved to better predict performance in a realistic environment.
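To illustrate the idea of reusing inexact LU factors, here is a minimal pure-Python sketch. It rests on several assumptions that are not taken from this work: the factors are made inexact by rounding them to single precision, the factorization uses no pivoting (so it contains no data-dependent branches), the test matrix is a small, dense, diagonally dominant example, and plain residual-correction iterative refinement stands in for FGMRES. All function names are hypothetical.

```python
import struct


def to_single(x):
    """Round a float to IEEE single precision (used here to make the factors inexact)."""
    return struct.unpack("f", struct.pack("f", x))[0]


def lu_no_pivot(A):
    """Doolittle LU without pivoting, with every stored entry rounded to single precision.

    Returns one matrix holding L (unit lower triangle, below the diagonal)
    and U (upper triangle, including the diagonal).
    Assumption: A needs no pivoting (for example, it is diagonally dominant).
    """
    n = len(A)
    LU = [[to_single(v) for v in row] for row in A]
    for k in range(n):
        for i in range(k + 1, n):
            LU[i][k] = to_single(LU[i][k] / LU[k][k])
            for j in range(k + 1, n):
                LU[i][j] = to_single(LU[i][j] - LU[i][k] * LU[k][j])
    return LU


def lu_solve(LU, b):
    """Solve L U x = b by forward and backward substitution."""
    n = len(b)
    x = [float(v) for v in b]
    for i in range(n):                      # forward substitution with unit-diagonal L
        for j in range(i):
            x[i] -= LU[i][j] * x[j]
    for i in reversed(range(n)):            # backward substitution with U
        for j in range(i + 1, n):
            x[i] -= LU[i][j] * x[j]
        x[i] /= LU[i][i]
    return x


def residual(A, x, b):
    """Return r = b - A x, computed in double precision."""
    return [bi - sum(aij * xj for aij, xj in zip(row, x)) for row, bi in zip(A, b)]


def inf_norm(v):
    return max(abs(vi) for vi in v)


def refine(A, b, LU, tol=1e-12, max_iter=50):
    """Iterative refinement: reuse the same inexact factors to correct the solution."""
    x = lu_solve(LU, b)
    for it in range(max_iter):
        r = residual(A, x, b)
        if inf_norm(r) <= tol * inf_norm(b):
            return x, it
        d = lu_solve(LU, r)                 # cheap correction; no refactorization
        x = [xi + di for xi, di in zip(x, d)]
    return x, max_iter


# Small illustrative system whose exact solution is [1, 2, 3].
A = [[4.0, 1.0, 0.0],
     [1.0, 4.0, 1.0],
     [0.0, 1.0, 4.0]]
b = [6.0, 12.0, 14.0]

LU = lu_no_pivot(A)          # the expensive step, done once
x, iters = refine(A, b, LU)  # cheap steps that reuse the factors
print(x, iters)
```

The factorization is computed once and each refinement step needs only triangular solves and a matrix-vector product. In this sketch that separation is purely illustrative and does not reflect any particular CPU/GPU implementation.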