Finding the sparsest solution to an underdetermined system $\mathbf{y}=\mathbf{Ax}$ within a given tolerance is known to be NP-hard. Many approximation algorithms for this problem exist, and Orthogonal Matching Pursuit (OMP) is one of the most widely used. However, existing OMP implementations do not take full advantage of matrix properties or of modern CPU- and GPU-based linear algebra kernels. In this paper, we present an efficient implementation of OMP that leverages properties of the Cholesky inverse as well as the power of GPUs to deliver up to \textbf{310x speedup over Scikit-Learn} and \textbf{26x over SPAMS}. The package is published on PyPI (\texttt{pip install batched-omp}) and is fully scikit-learn compatible.
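For readers unfamiliar with OMP, the greedy loop it refers to can be sketched in a few lines of NumPy. This is a naive textbook version, not the implementation described here: it re-solves a full least-squares problem at every step and handles a single signal, with no Cholesky updates, batching, or GPU use. The names `omp`, `n_nonzero`, and `tol` are illustrative assumptions and are not taken from the `batched-omp` package.

```python
import numpy as np


def omp(A, y, n_nonzero, tol=1e-10):
    """Naive OMP sketch: greedily select columns of A, refit by least squares.

    Hypothetical helper for illustration only; not the batched-omp API.
    """
    n = A.shape[1]
    x = np.zeros(n)
    support = []                      # indices of the selected columns
    coef = np.zeros(0)
    residual = np.asarray(y, dtype=float).copy()

    for _ in range(n_nonzero):
        # Stop early once the residual is within the tolerance.
        if np.linalg.norm(residual) <= tol:
            break
        # Select the column most correlated with the current residual.
        correlations = np.abs(A.T @ residual)
        if support:
            correlations[support] = 0.0   # never reselect a column
        support.append(int(np.argmax(correlations)))
        # Refit y on all selected columns (the orthogonal projection step).
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef

    if support:
        x[support] = coef
    return x
```

Calling `omp(A, y, n_nonzero=k)` returns a vector with at most `k` nonzero entries. The repeated `lstsq` call is the obvious cost in this sketch, and it is the kind of redundancy that an incremental factorization of the selected columns is meant to avoid.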