This paper investigates the spectral-norm version of the column subset selection problem. Given a matrix $\mathbf{A}\in\mathbb{R}^{n\times d}$ and a positive integer $k\leq\text{rank}(\mathbf{A})$, the objective is to select exactly $k$ columns of $\mathbf{A}$ that minimize the spectral norm of the residual matrix after projecting $\mathbf{A}$ onto the space spanned by the selected columns. We use the method of interlacing polynomials introduced by Marcus, Spielman, and Srivastava to derive a new upper bound on the minimal approximation error. This bound is asymptotically sharp when the spectrum of $\mathbf{A}$ obeys a power-law decay. The relevant expected characteristic polynomials can be written as an extension of the expected polynomial for the restricted invertibility problem, incorporating two extra variable substitution operators. Finally, we propose a deterministic polynomial-time algorithm that achieves this error bound up to a computational error.
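To make the objective concrete, the following is a minimal sketch of the quantity being minimized, written with the assumed notation $\mathbf{A}_S$ for the submatrix of columns indexed by $S$, so that the residual is $\mathbf{A}-\mathbf{A}_S\mathbf{A}_S^{\dagger}\mathbf{A}$. It is not the interlacing-polynomial algorithm proposed in the paper: it simply enumerates all $k$-subsets, which is exponential in $k$ and only feasible for tiny matrices. The function names `residual_spectral_norm` and `brute_force_css` are hypothetical and chosen for illustration only.

```python
import itertools

import numpy as np


def residual_spectral_norm(A, cols):
    """Spectral norm of A minus its projection onto span(A[:, cols])."""
    S = A[:, list(cols)]
    # S @ pinv(S) is the orthogonal projector onto the column space of S.
    residual = A - S @ (np.linalg.pinv(S) @ A)
    # ord=2 on a 2-D array gives the largest singular value.
    return np.linalg.norm(residual, 2)


def brute_force_css(A, k):
    """Exhaustive search over all k-subsets of columns (illustration only)."""
    d = A.shape[1]
    best = min(
        itertools.combinations(range(d), k),
        key=lambda cols: residual_spectral_norm(A, cols),
    )
    return best, residual_spectral_norm(A, best)


if __name__ == "__main__":
    # Assumed toy input: a random 6 x 8 matrix, selecting k = 3 columns.
    rng = np.random.default_rng(0)
    A = rng.standard_normal((6, 8))
    k = 3
    cols, err = brute_force_css(A, k)
    # Singular values are returned in descending order, so index k is sigma_{k+1}.
    sigma = np.linalg.svd(A, compute_uv=False)
    print("selected columns:", cols)
    print("residual spectral norm:", err)
    print("sigma_{k+1}:", sigma[k])
```

Because the projected matrix has rank at most $k$, the Eckart-Young theorem implies that the residual norm of any selection is at least the $(k+1)$-th singular value of $\mathbf{A}$, so the last printed value serves as a reference point for how close a given selection is to the best rank-$k$ approximation.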