We investigate the solution of low-rank matrix approximation problems using the truncated SVD. For this purpose, we develop and optimize GPU implementations of the randomized SVD and of a blocked variant of the Lanczos approach. Our work takes advantage of the fact that the two methods are composed of very similar linear algebra building blocks, which can be assembled using numerical kernels from existing high-performance linear algebra libraries. Furthermore, experiments with several sparse matrices arising in representative real-world applications, as well as with synthetic dense test matrices, reveal a performance advantage of the block Lanczos algorithm over the randomized SVD when both target the same approximation accuracy.
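To make the "building blocks" remark concrete, the following is a minimal CPU sketch of a randomized SVD with power (subspace) iteration, written with NumPy. It is not the GPU implementation described here; the function name `randomized_svd` and the parameters `oversample`, `power_iters`, and `seed` are illustrative assumptions. The point is only that the method reduces to matrix products with the input matrix and its transpose, tall-skinny QR factorizations, and one small dense SVD.

```python
import numpy as np

def randomized_svd(A, rank, oversample=10, power_iters=2, seed=0):
    """Approximate the `rank` leading singular triplets of A (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    # Block size: target rank plus a small oversampling margin (assumed default).
    b = min(rank + oversample, m, n)

    # Building block 1: product of A with a random Gaussian test matrix.
    Omega = rng.standard_normal((n, b))
    # Building block 2: orthonormalization of a tall-skinny block via QR.
    Q, _ = np.linalg.qr(A @ Omega)

    # Power iteration, re-orthonormalizing after every product for stability.
    for _ in range(power_iters):
        Z, _ = np.linalg.qr(A.T @ Q)
        Q, _ = np.linalg.qr(A @ Z)

    # Building block 3: SVD of the small projected matrix B = Q^T A (b x n).
    B = Q.T @ A
    Ub, s, Vt = np.linalg.svd(B, full_matrices=False)

    # Map the left singular vectors back to the original space and truncate.
    U = Q @ Ub
    return U[:, :rank], s[:rank], Vt[:rank, :]
```

A block Lanczos (block Golub-Kahan) procedure relies on the same kinds of operations, namely products with the matrix and its transpose, block orthogonalization, and a small dense SVD, which is what allows both methods to share kernels from an existing library.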