A random $m\times n$ matrix $S$ is an oblivious subspace embedding (OSE) with parameters $\epsilon>0$, $\delta\in(0,1/3)$ and $d\leq m\leq n$ if, for any $d$-dimensional subspace $W\subseteq \mathbb{R}^n$, $P\big(\,\forall_{x\in W}\ (1+\epsilon)^{-1}\|x\|\leq\|Sx\|\leq (1+\epsilon)\|x\|\,\big)\geq 1-\delta.$ It is known that the embedding dimension of an OSE must satisfy $m\geq d$, and that for any $\theta > 0$, a Gaussian embedding matrix with $m\geq (1+\theta) d$ is an OSE with $\epsilon = O_\theta(1)$. However, such an optimal embedding dimension is not known for other embeddings. Of particular interest are sparse OSEs, which have $s\ll m$ non-zeros per column, with applications to problems such as least squares regression and low-rank approximation. We show that, given any $\theta > 0$, an $m\times n$ random matrix $S$ with $m\geq (1+\theta)d$, consisting of randomly sparsified $\pm1/\sqrt s$ entries and having $s= O(\log^4(d))$ non-zeros per column, is an oblivious subspace embedding with $\epsilon = O_{\theta}(1)$. Our result addresses the main open question posed by Nelson and Nguyen (FOCS 2013), who conjectured that sparse OSEs can achieve $m=O(d)$ embedding dimension, and it improves on the $m=O(d\log(d))$ bound shown by Cohen (SODA 2016). We use this to construct the first oblivious subspace embedding with $O(d)$ embedding dimension that can be applied faster than current matrix multiplication time, and to obtain an optimal single-pass algorithm for least squares regression. We further extend our results to construct even sparser non-oblivious embeddings, leading to the first subspace embedding with low distortion $\epsilon=o(1)$ and optimal embedding dimension $m=O(d/\epsilon^2)$ that can be applied in current matrix multiplication time.
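As a rough illustration of the kind of matrix the abstract describes, the following Python sketch builds a sparse sign matrix and measures how it distorts one random $d$-dimensional subspace. It is a sketch under assumptions, not the paper's construction: the abstract does not specify the sparsification model, so the code assumes exactly $s$ non-zeros per column at uniformly random rows, and the values of `n`, `d`, `theta`, and `s` as well as the function names are hypothetical choices made for illustration. If $U$ is an orthonormal basis of $W$, then $\|Sx\|/\|x\|$ over $x\in W$ ranges exactly between the smallest and largest singular values of $SU$, so the embedding condition holds for $W$ when both lie in $[(1+\epsilon)^{-1},\,1+\epsilon]$.

```python
import numpy as np


def sparse_sign_embedding(m, n, s, rng):
    """Return an m x n array with exactly s entries of +-1/sqrt(s) per column.

    Assumption: non-zero positions are chosen uniformly without replacement
    in each column; the paper's sparsification model may differ.
    """
    S = np.zeros((m, n))
    for j in range(n):
        rows = rng.choice(m, size=s, replace=False)   # s distinct row indices
        signs = rng.choice([-1.0, 1.0], size=s)       # independent random signs
        S[rows, j] = signs / np.sqrt(s)
    return S


def distortion(S, U):
    """Smallest and largest singular values of S @ U for orthonormal U."""
    sv = np.linalg.svd(S @ U, compute_uv=False)
    return sv.min(), sv.max()


rng = np.random.default_rng(0)

# Hypothetical sizes chosen only for illustration.
n, d, theta, s = 2000, 50, 1.0, 8
m = int(np.ceil((1 + theta) * d))                     # m >= (1 + theta) d

S = sparse_sign_embedding(m, n, s, rng)

# Orthonormal basis of a random d-dimensional subspace of R^n.
U, _ = np.linalg.qr(rng.standard_normal((n, d)))

lo, hi = distortion(S, U)
print(f"m={m}, s={s}: singular values of SU lie in [{lo:.3f}, {hi:.3f}]")
```

A single random subspace is only a sanity check; the OSE definition requires the bound to hold with probability at least $1-\delta$ for every fixed $d$-dimensional subspace, which an experiment like this cannot certify.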