On the Unreasonable Effectiveness of Single Vector Krylov Methods for Low-Rank Approximation

Krylov subspace methods are a ubiquitous tool for computing near-optimal rank $k$ approximations of large matrices. While "large block" Krylov methods with block size at least $k$ give the best known theoretical guarantees, block size one (a single vector) or a small constant is often preferred in practice. Despite their popularity, we lack theoretical bounds on the performance of such "small block" Krylov methods for low-rank approximation. We address this gap between theory and practice by proving that small block Krylov methods essentially match all known low-rank approximation guarantees for large block methods. Via a black-box reduction we show, for example, that the standard single vector Krylov method run for $t$ iterations obtains the same spectral norm and Frobenius norm error bounds as a Krylov method with block size $\ell \geq k$ run for $O(t/\ell)$ iterations, up to a logarithmic dependence on the smallest gap between sequential singular values. That is, for a given number of matrix-vector products, single vector methods are essentially as effective as any choice of large block size. By combining our result with tail bounds on eigenvalue gaps in random matrices, we prove that the dependence on the smallest singular value gap can be eliminated if the input matrix is perturbed by a small random matrix. Further, we show that single vector methods match the more complex algorithm of [Bakshi et al. '22], which combines the results of multiple block sizes to achieve an improved algorithm for Schatten $p$-norm low-rank approximation.
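
To make the object of study concrete, the following is a minimal sketch of a single vector Krylov method for rank $k$ approximation, written in NumPy. It is an illustration under stated assumptions, not the paper's exact algorithm: the function name `single_vector_krylov_lowrank`, the Gaussian starting vector, the breakdown tolerance, and the use of full reorthogonalization (applied twice per step for numerical stability) are all choices made here for the example. The sketch builds an orthonormal basis for $\mathrm{span}\{Ax, (AA^T)Ax, \ldots, (AA^T)^{t-1}Ax\}$ and then takes the best rank $k$ approximation of $A$ within that subspace.

```python
import numpy as np


def single_vector_krylov_lowrank(A, k, t, seed=0):
    """Sketch: rank-k approximation of A from a t-dimensional single vector
    Krylov subspace. Returns (Z, s, Vt) with A approximated by
    Z @ np.diag(s) @ Vt, where Z has orthonormal columns."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = rng.standard_normal(d)          # assumed: random Gaussian start vector

    Q = np.zeros((n, t))                # orthonormal Krylov basis, built column by column
    v = A @ x
    scale = np.linalg.norm(v)
    for j in range(t):
        if j > 0:
            # Next Krylov direction: apply A A^T to the latest basis vector.
            v = A @ (A.T @ Q[:, j - 1])
        # Full reorthogonalization against earlier basis vectors (done twice).
        for _ in range(2):
            v = v - Q[:, :j] @ (Q[:, :j].T @ v)
        nrm = np.linalg.norm(v)
        if nrm <= 1e-12 * scale:        # assumed tolerance: subspace stopped growing
            Q = Q[:, :j]
            break
        Q[:, j] = v / nrm

    # Best rank-k approximation of A restricted to span(Q): truncated SVD of Q^T A.
    U, s, Vt = np.linalg.svd(Q.T @ A, full_matrices=False)
    k = min(k, s.size)
    return Q @ U[:, :k], s[:k], Vt[:k]
```

A call such as `Z, s, Vt = single_vector_krylov_lowrank(A, k=5, t=20)` uses roughly $2t$ matrix-vector products with $A$ and $A^T$. A block method with block size $\ell \geq k$ would spend the same budget on about $t/\ell$ iterations, which is the comparison the abstract describes.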
