The interpolative decomposition (ID) constructs a low-rank approximation whose basis consists of row or column skeletons of the original matrix, together with a corresponding interpolation matrix. This work explores fast and accurate ID algorithms from five perspectives that are essential for empirical performance: (a) skeleton complexity, which measures the minimum possible ID rank for a given low-rank approximation error; (b) asymptotic complexity in FLOPs; (c) parallelizability, with the computational bottleneck cast as matrix-matrix multiplications; (d) the error-revealing property, which enables automatic rank detection for a given error tolerance without prior knowledge of the target rank; and (e) the ID-revealing property, which ensures efficient construction of the optimal interpolation matrix after the skeletons are selected. While a broad spectrum of algorithms has been developed to optimize some of these perspectives, practical ID algorithms proficient in all of them remain absent. To fill this gap, we introduce robust blockwise random pivoting (RBRP), which is parallelizable, error-revealing, and exactly ID-revealing, with skeleton and asymptotic complexities comparable to those of the best existing ID algorithms in practice. Through extensive numerical experiments on various synthetic and natural datasets, we demonstrate the appealing empirical performance of RBRP from the five perspectives above, as well as its robustness to adversarial inputs.
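To make the terms above concrete, the following is a minimal sketch of a column ID, A ≈ A[:, S] W, where S indexes the skeleton columns and W is the interpolation matrix. It uses plain sequential greedy pivoting (pick the column with the largest residual norm) and is not the RBRP algorithm, which is blockwise and randomized. The function name `greedy_column_id`, the parameter `tol`, and the relative squared Frobenius stopping rule are assumptions made for illustration only.

```python
import numpy as np


def greedy_column_id(A, tol):
    """Column ID A ~= A[:, S] @ W via sequential greedy pivoting.

    Illustrative only: this is NOT RBRP. `tol` is a hypothetical
    tolerance on the relative squared Frobenius residual.
    """
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    R = A.copy()  # residual after projecting out the chosen skeletons
    total = np.linalg.norm(A, "fro") ** 2
    skeletons = []

    # Error-revealing loop: the residual norm is tracked at every step,
    # so the rank is detected from `tol` rather than fixed in advance.
    while len(skeletons) < min(m, n):
        col_norms = np.sum(R * R, axis=0)
        if col_norms.sum() <= tol * total:
            break
        j = int(np.argmax(col_norms))        # greedy pivot
        q = R[:, j] / np.sqrt(col_norms[j])  # unit vector along residual column
        R -= np.outer(q, q @ R)              # remove that direction everywhere
        skeletons.append(j)

    if not skeletons:
        return [], np.zeros((0, n))

    # Optimal interpolation matrix for the chosen skeletons (least squares).
    C = A[:, skeletons]
    W, *_ = np.linalg.lstsq(C, A, rcond=None)
    return skeletons, W


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((50, 3)) @ rng.standard_normal((3, 40))
    S, W = greedy_column_id(A, tol=1e-10)
    rel_err = np.linalg.norm(A - A[:, S] @ W) / np.linalg.norm(A)
    print("skeletons:", S, "relative error:", rel_err)
```

In this sketch the interpolation matrix is computed by a separate least-squares solve after the skeletons are chosen. An ID-revealing algorithm in the sense of perspective (e) would instead obtain W cheaply from quantities already formed during skeleton selection.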