Generalization of Higher Order Methods for Fast Iterative Matrix Inversion Suitable for GPU Acceleration

Recent technological developments have led to big data processing, which has created significant computational difficulties when solving large-scale linear systems or inverting matrices. As a result, fast approximate iterative matrix inversion methods accelerated by Graphics Processing Units (GPUs) have been the subject of extensive research, with the aim of finding solutions where classical direct inversion is too expensive. Methods currently in use include the Neumann Series (NS), Newton Iteration (NI), Chebyshev Iteration (CI), and Successive Over-Relaxation, to cite a few. In this work, we develop a new iterative algorithm based on the NS, which we name 'Nested Neumann' (NN). This new methodology generalizes higher orders of the NI (or CI) by taking advantage of a computationally free iterative update of the preconditioning matrix as a function of a given 'inception depth'. We demonstrate mathematically that the NN: (i) converges provided the preconditioner satisfies the spectral norm condition of the NS, (ii) has an order of convergence equal to the inception depth plus one, and (iii) has an optimal inception depth of one or, preferably, two, depending on RAM constraints. Furthermore, we derive an explicit formula for the NN that is applicable to massive sparse matrices, at the expense of increased computational cost. Importantly, the NN yields an analytic equivalence statement between the NS and the NN (NI, CI, and higher orders), which is of importance for massive MIMO (mMIMO) systems. Finally, the NN method is applicable to matrix inversion of positive semi-definite matrices and to any linear system (sparse, non-sparse, complex, etc.).
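The link between the Neumann series and higher-order Newton-type iterations can be illustrated with the classical hyperpower iteration, X_{k+1} = (I + R_k + ... + R_k^{p-1}) X_k with R_k = I - X_k A. The sketch below is not the NN algorithm itself, whose update rule is not given in this abstract. It only shows the well-known fact that k steps of an order-p hyperpower iteration started from a preconditioner P reproduce the first p^k terms of the Neumann series. All function names are hypothetical, and reading "inception depth plus one" as the `order` parameter is an assumption made for illustration. Plain Python lists keep the example self-contained; a GPU implementation would replace the helpers with dense or sparse matrix products from an array library.

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def neumann_inverse(a, p, terms):
    """Truncated Neumann series: sum_{j < terms} (I - P A)^j P."""
    r = sub(identity(len(a)), matmul(p, a))
    term = p          # holds (I - P A)^j P for the current j
    total = p
    for _ in range(terms - 1):
        term = matmul(r, term)
        total = add(total, term)
    return total

def hyperpower_inverse(a, p, order, steps):
    """Classical hyperpower iteration of the given order (order >= 2).

    order=2 is the Newton (Newton-Schulz) iteration X <- (2I - X A) X.
    Each step maps the residual R = I - X A to R**order.
    """
    n = len(a)
    x = p
    for _ in range(steps):
        r = sub(identity(n), matmul(x, a))
        # Horner evaluation of I + R + ... + R^(order - 1)
        poly = identity(n)
        for _ in range(order - 1):
            poly = add(identity(n), matmul(r, poly))
        x = matmul(poly, x)
    return x

# Illustrative 2x2 system with a Jacobi-style (inverse diagonal) preconditioner,
# chosen so that the spectral norm of I - P A is below one.
a = [[4.0, 1.0], [2.0, 3.0]]
p = [[0.25, 0.0], [0.0, 1.0 / 3.0]]

newton = hyperpower_inverse(a, p, order=2, steps=3)   # residual R_0^(2^3)
series = neumann_inverse(a, p, terms=8)               # 2^3 Neumann terms
print(max_abs_diff(newton, series))                   # agreement up to rounding
```

The two routines agree up to floating-point rounding, but the hyperpower form reaches the same approximation with far fewer matrix products as the number of steps grows, which is the motivation for higher-order schemes on hardware optimized for matrix multiplication.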
