Iterative solvers are widely used in scientific and engineering computing, but their efficiency is limited by the memory-bound sparse matrix-vector multiplication (SpMV) kernel. As modern hardware increasingly supports low-precision arithmetic, mixed-precision optimization of iterative algorithms has attracted wide attention. Existing mixed-precision methods, however, suffer from format-conversion overhead, tight coupling between the storage and computation representations, and the need to keep multiple copies of the data at different precisions. This paper proposes a floating-point representation based on a group-shared exponent and segmented storage of the mantissa. The representation achieves higher bit utilization for the represented vector and allows fast switching between precisions without storing multiple copies of the data. We further propose a stepped mixed-precision iterative algorithm. Experimental results show that, compared with existing floating-point formats, our approach significantly improves both the performance of iterative algorithms and the residuals at which they converge.
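The idea of a group-shared exponent with a segmented mantissa can be illustrated with a minimal sketch. The abstract does not give the actual bit layout, so everything concrete here is an assumption made for illustration: the segment width (`SEG_BITS`), the number of segments (`NUM_SEGS`), the choice of the largest exponent in the group as the shared one, truncation instead of rounding, a separately stored sign, and the hypothetical function names `encode_group` and `decode_group`.

```python
import math

SEG_BITS = 8   # bits per mantissa segment (assumed, not from the paper)
NUM_SEGS = 3   # segments per value (assumed), i.e. a 24-bit mantissa


def encode_group(values):
    """Encode a group of floats with one shared exponent and segmented mantissas."""
    total_bits = SEG_BITS * NUM_SEGS
    # Shared exponent: the largest binary exponent in the group,
    # so that every |v| < 2**shared_exp.
    shared_exp = max((math.frexp(v)[1] for v in values if v != 0.0), default=0)
    signs = []
    segments = [[] for _ in range(NUM_SEGS)]  # one array per mantissa segment
    mask = (1 << SEG_BITS) - 1
    for v in values:
        # Scale |v| into [0, 2**total_bits) and truncate to an integer mantissa.
        mant = int(math.ldexp(abs(v), total_bits - shared_exp))
        signs.append(v < 0)
        for s in range(NUM_SEGS):
            # Segment 0 holds the most significant bits.
            shift = total_bits - SEG_BITS * (s + 1)
            segments[s].append((mant >> shift) & mask)
    return shared_exp, signs, segments


def decode_group(shared_exp, signs, segments, segs_used):
    """Reconstruct the values from only the first `segs_used` mantissa segments."""
    used_bits = SEG_BITS * segs_used
    out = []
    for i, negative in enumerate(signs):
        mant = 0
        for s in range(segs_used):
            mant = (mant << SEG_BITS) | segments[s][i]
        x = math.ldexp(mant, shared_exp - used_bits)
        out.append(-x if negative else x)
    return out


values = [1.5, -0.375, 0.0078125, 3.25]
exp, signs, segs = encode_group(values)
print(decode_group(exp, signs, segs, 1))  # low precision: leading segment only
print(decode_group(exp, signs, segs, 3))  # full precision: all three segments
```

In this sketch, switching precision only changes how many segment arrays are read. The data is stored once, and a lower-precision pass touches fewer bytes, which is the property that matters for a memory-bound kernel such as SpMV. Values that are small relative to the largest entry in the group lose leading bits first, so group size and segment width trade accuracy against bandwidth.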