Sparse matrices and linear algebra are at the heart of scientific simulations. Over the years, more than 70 sparse matrix storage formats have been developed, targeting a wide range of hardware architectures and matrix types; each exploits the particular strengths of an architecture or the specific sparsity patterns of the matrices. In this work, we explore the suitability of storage formats such as COO, CSR and DIA for emerging architectures such as AArch64 CPUs and FPGAs. In addition, we detail hardware-specific optimisations for these targets and evaluate the potential of each contribution to be integrated into Morpheus, a modern library that currently provides an abstraction of sparse matrices across x86 CPUs and NVIDIA/AMD GPUs. Finally, we validate our work by comparing the performance of the Morpheus-enabled HPCG benchmark against vendor-optimised implementations.
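For readers unfamiliar with the three formats named above, the following is a minimal pure-Python sketch of COO (coordinate), CSR (compressed sparse row) and DIA (diagonal) storage, together with a sparse matrix-vector product for each. It is illustrative only and is not the Morpheus API: all function names (`to_coo`, `to_csr`, `to_dia`, `spmv_coo`, `spmv_csr`, `spmv_dia`) and the small tridiagonal test matrix are assumptions introduced here, and the DIA layout is indexed by row, which is one of several conventions in use.

```python
# Illustrative sketch only: helper names and the example matrix are hypothetical.

def to_coo(dense):
    """COO: one (row, column, value) triple per non-zero entry."""
    rows, cols, vals = [], [], []
    for i, row in enumerate(dense):
        for j, v in enumerate(row):
            if v != 0:
                rows.append(i)
                cols.append(j)
                vals.append(v)
    return rows, cols, vals

def to_csr(dense):
    """CSR: row_ptr[i]..row_ptr[i+1] delimits the non-zeros of row i."""
    row_ptr, col_idx, vals = [0], [], []
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                col_idx.append(j)
                vals.append(v)
        row_ptr.append(len(vals))
    return row_ptr, col_idx, vals

def to_dia(dense):
    """DIA (row-indexed, an assumed convention): data[d][i] = A[i][i + offsets[d]]."""
    n_rows, n_cols = len(dense), len(dense[0])
    offsets = sorted({j - i for i, row in enumerate(dense)
                      for j, v in enumerate(row) if v != 0})
    data = [[dense[i][i + off] if 0 <= i + off < n_cols else 0
             for i in range(n_rows)]
            for off in offsets]
    return offsets, data

def spmv_coo(coo, x, n_rows):
    rows, cols, vals = coo
    y = [0] * n_rows
    for i, j, v in zip(rows, cols, vals):
        y[i] += v * x[j]
    return y

def spmv_csr(csr, x):
    row_ptr, col_idx, vals = csr
    y = []
    for i in range(len(row_ptr) - 1):
        start, end = row_ptr[i], row_ptr[i + 1]
        y.append(sum(vals[k] * x[col_idx[k]] for k in range(start, end)))
    return y

def spmv_dia(dia, x):
    offsets, data = dia
    n_rows = len(data[0]) if data else 0
    y = [0] * n_rows
    for off, diag in zip(offsets, data):
        for i in range(n_rows):
            j = i + off
            if 0 <= j < len(x):  # skip padding outside the matrix
                y[i] += diag[i] * x[j]
    return y

# Small tridiagonal example: well suited to DIA, but valid in all three formats.
A = [[4, 1, 0, 0],
     [1, 4, 1, 0],
     [0, 1, 4, 1],
     [0, 0, 1, 4]]
x = [1, 2, 3, 4]

y_coo = spmv_coo(to_coo(A), x, len(A))
y_csr = spmv_csr(to_csr(A), x)
y_dia = spmv_dia(to_dia(A), x)
```

All three products yield the same vector; the formats differ only in how the non-zeros are laid out in memory, which is what makes one format or another a better fit for a given architecture or sparsity pattern.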