Matrix-vector multiplication forms the basis of many iterative solution algorithms and is therefore an important operation for hierarchical matrices, which represent dense data in an optimized form by applying low-rank compression. However, due to its low computational intensity, the performance of matrix-vector multiplication is typically limited by the available memory bandwidth on parallel systems. Floating point compression reduces the memory footprint, which lowers the stress on the memory subsystem and thereby increases performance. We look into the compression of different hierarchical matrix formats and how it can be used to speed up the corresponding matrix-vector multiplication.
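As a rough illustration of the idea, and not of the specific compression schemes studied in this work, the sketch below stores the factors of a single low-rank block in single precision instead of double precision and performs the matrix-vector product in double precision. Single-precision storage is only a stand-in for floating point compression here. The function names (`lowrank_matvec`, `compress`, `nbytes`), the block size, and the test data are hypothetical.

```python
from array import array


def lowrank_matvec(U, V, x):
    """Compute y = U (V^T x) for an m-by-n block given as factors.

    U holds k columns of length m, V holds k columns of length n.
    The columns may be stored in any float precision; all arithmetic
    is carried out in Python floats (double precision).
    """
    # t = V^T x: one dot product per column of V
    t = [sum(vj * xj for vj, xj in zip(col, x)) for col in V]
    # y = U t: accumulate the k scaled columns of U
    m = len(U[0])
    y = [0.0] * m
    for col, tj in zip(U, t):
        for i in range(m):
            y[i] += col[i] * tj
    return y


def compress(cols):
    """Assumed stand-in for floating point compression: store each
    column in single precision (typecode 'f') instead of double."""
    return [array('f', list(col)) for col in cols]


def nbytes(cols):
    """Bytes occupied by the stored values of all columns."""
    return sum(c.itemsize * len(c) for c in cols)


# Hypothetical rank-2 block of size 64 x 64 with positive entries
m, n, k = 64, 64, 2
U64 = [array('d', [1.0 / (i + j + 1) for i in range(m)]) for j in range(k)]
V64 = [array('d', [1.0 / (i + 2 * j + 1) for i in range(n)]) for j in range(k)]
x = [1.0] * n

U32, V32 = compress(U64), compress(V64)

y64 = lowrank_matvec(U64, V64, x)
y32 = lowrank_matvec(U32, V32, x)

# Relative error introduced by the reduced-precision storage
rel_err = max(abs(a - b) for a, b in zip(y64, y32)) / max(abs(a) for a in y64)

dense_bytes = m * n * 8  # the same block stored densely in double precision
print(dense_bytes, nbytes(U64) + nbytes(V64), nbytes(U32) + nbytes(V32), rel_err)
```

Since the product is limited by memory bandwidth rather than arithmetic, the bytes read per product are the quantity of interest: the low-rank factors are already far smaller than the dense block, and the reduced-precision storage halves them again at the cost of a storage error on the order of single-precision rounding.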