The interpolative decomposition (ID) constructs a low-rank approximation whose basis consists of row or column skeletons of the original matrix, paired with a corresponding interpolation matrix. This work explores fast and accurate ID algorithms from five perspectives that are essential for empirical performance: (a) skeleton complexity, which measures the minimum possible ID rank for a given low-rank approximation error; (b) asymptotic complexity in FLOPs; (c) parallelizability, with the computational bottleneck cast as matrix-matrix multiplications; (d) the error-revealing property, which enables automatic rank detection for a given error tolerance without prior knowledge of the target rank; and (e) the ID-revealing property, which ensures efficient construction of the optimal interpolation matrix after the skeletons are selected. While a broad spectrum of algorithms has been developed to optimize some of these perspectives, practical ID algorithms proficient in all of them remain absent. To fill this gap, we introduce robust blockwise random pivoting (RBRP), which is parallelizable, error-revealing, and exact-ID-revealing, with skeleton and asymptotic complexities comparable to those of the best existing ID algorithms in practice. Through extensive numerical experiments on various synthetic and natural datasets, we empirically demonstrate the appealing performance of RBRP from the five perspectives above, as well as its robustness to adversarial inputs.
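To make the terms above concrete, the following is a minimal sketch of a column ID built by sequential random pivoting with an error-based stopping rule. It is not RBRP, whose blockwise and robust filtering steps are not specified in this abstract; it only illustrates what "skeletons", "interpolation matrix", and "error-revealing" refer to. The function name `column_id` and the parameters `tol` and `rng` are assumptions introduced for illustration.

```python
import numpy as np


def column_id(A, tol=1e-6, rng=None):
    """Illustrative column ID: A is approximated by A[:, skeleton] @ W.

    Columns are sampled one at a time with probability proportional to
    their squared residual norms. The loop stops once the relative
    Frobenius residual drops below `tol`, so no target rank is needed.
    """
    rng = np.random.default_rng(rng)
    A = np.asarray(A, dtype=float)
    n = A.shape[1]
    R = A.copy()                      # residual after removing selected directions
    total = np.sum(A * A)             # squared Frobenius norm of A
    skeleton = []

    while len(skeleton) < min(A.shape):
        col_norms = np.sum(R * R, axis=0)
        residual = col_norms.sum()
        if residual <= (tol ** 2) * total:
            break                     # error tolerance reached: rank detected
        j = int(rng.choice(n, p=col_norms / residual))
        q = R[:, j] / np.sqrt(col_norms[j])
        R -= np.outer(q, q @ R)       # project out the newly selected direction
        R[:, j] = 0.0                 # a selected column can never be resampled
        skeleton.append(j)

    if not skeleton:
        return skeleton, np.zeros((0, n))
    # Least-squares interpolation matrix: minimizes ||A - A[:, skeleton] @ W||_F.
    W = np.linalg.lstsq(A[:, skeleton], A, rcond=None)[0]
    return skeleton, W
```

For an exactly rank-5 matrix, the loop would be expected to stop after five skeleton columns, and `A[:, skeleton] @ W` should reproduce `A` up to rounding. Note that this sketch computes the interpolation matrix in a separate least-squares solve after selection, and it updates the residual one column at a time, so it does not illustrate the efficiency aspects, (c) and (e), that the abstract attributes to RBRP.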