Finding the sparsest solution to the underdetermined system $\mathbf{y}=\mathbf{Ax}$ within a given tolerance is known to be NP-hard. Many approximation algorithms for this problem exist, and Orthogonal Matching Pursuit (OMP) is one of the most widely used. However, existing OMP implementations do not take full advantage of matrix properties or of modern CPU- and GPU-based linear algebra kernels. In this paper, we present an efficient implementation of OMP that leverages Cholesky inverse properties as well as the power of GPUs to deliver up to \textbf{310x speedup over scikit-learn} and \textbf{26x over SPAMS}. The package is published on PyPI (\texttt{pip install batched-omp}) and is fully scikit-learn compatible.
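For orientation, the plain OMP iteration can be sketched in a few lines of NumPy. This is a textbook reference version that refits by least squares at every step; it is not the batched, Cholesky-based implementation presented in this paper, and the function name \texttt{omp\_reference} and its parameters are illustrative assumptions rather than part of the \texttt{batched-omp} API.

```python
import numpy as np


def omp_reference(A, y, n_nonzero, tol=1e-10):
    """Textbook OMP sketch (hypothetical helper, not the batched-omp API).

    Greedily selects up to `n_nonzero` columns of A and refits the
    selected coefficients by least squares after each selection.
    """
    x = np.zeros(A.shape[1])
    support = []
    residual = y.astype(float).copy()
    for _ in range(n_nonzero):
        # Stop early once the residual is within the tolerance.
        if np.linalg.norm(residual) <= tol:
            break
        # Pick the column most correlated with the current residual.
        correlations = np.abs(A.T @ residual)
        if support:
            correlations[support] = 0.0  # never reselect a column
        support.append(int(np.argmax(correlations)))
        # Refit all selected coefficients; the new residual is
        # orthogonal to every selected column.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
        x[:] = 0.0
        x[support] = coef
    return x
```

The per-iteration least-squares refit is the step that dominates the cost of this naive version, which is why exploiting the structure of the growing subproblem matters for performance.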