When the rows of an $n \times d$ matrix $A$ are given in a stream, we study algorithms for approximating the top eigenvector of the matrix ${A}^TA$ (equivalently, the top right singular vector of $A$). We consider worst-case inputs $A$ but assume that the rows are presented to the streaming algorithm in a uniformly random order. We show that when the gap parameter $R = \sigma_1(A)^2/\sigma_2(A)^2 = \Omega(1)$, there is a randomized algorithm that uses $O(h \cdot d \cdot \operatorname{polylog}(d))$ bits of space and outputs a unit vector $v$ with correlation $1 - O(1/\sqrt{R})$ with the top eigenvector $v_1$. Here $h$ denotes the number of \emph{heavy rows} in the matrix, defined as the rows with Euclidean norm at least $\|{A}\|_F/\sqrt{d \cdot \operatorname{polylog}(d)}$. We also provide a lower bound showing that any algorithm using $O(hd/R)$ bits of space can obtain at most $1 - \Omega(1/R^2)$ correlation with the top eigenvector. Thus, parameterizing the space complexity in terms of the number of heavy rows is necessary for high-accuracy solutions. Our results improve upon the $R = \Omega(\log n \cdot \log d)$ requirement in a recent work of Price and Xun (FOCS 2024). We note that the algorithm of Price and Xun works for arbitrary order streams, whereas our algorithm requires the stronger assumption that the rows are presented in a uniformly random order. We additionally show that the gap requirements in their analysis can be brought down to $R = \Omega(\log^2 d)$ for arbitrary order streams and $R = \Omega(\log d)$ for random order streams. The requirement of $R = \Omega(\log d)$ for random order streams is nearly tight for their analysis, as we exhibit a simple instance with $R = \Omega(\log d/\log\log d)$ for which their algorithm, with any fixed learning rate, cannot output a vector approximating the top eigenvector $v_1$.
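The "fixed learning rate" mentioned above refers to an Oja-style streaming iteration: each arriving row nudges a running unit-vector estimate toward the top eigenvector of $A^TA$. The sketch below is illustrative only and is not the algorithm of this paper or of Price and Xun; the function name, the synthetic data, and the learning-rate value are assumptions chosen for the demonstration.

```python
import numpy as np

def oja_top_eigenvector(rows, d, lr):
    """Single-pass Oja iteration with a fixed learning rate `lr`:
    for each streamed row a, update v toward (a a^T) v and renormalize.
    Illustrative sketch, not the paper's algorithm."""
    rng = np.random.default_rng(0)
    v = rng.standard_normal(d)
    v /= np.linalg.norm(v)          # random unit-norm start
    for a in rows:
        v = v + lr * a * (a @ v)    # rank-one update toward a (a . v)
        v /= np.linalg.norm(v)      # project back to the unit sphere
    return v

# Synthetic stream with a large gap R: rows concentrate along one direction.
rng = np.random.default_rng(1)
d = 20
u = np.zeros(d)
u[0] = 1.0                           # planted top direction
rows = [rng.normal(0.0, 3.0) * u + rng.normal(0.0, 0.1, d)
        for _ in range(2000)]
v = oja_top_eigenvector(rows, d, lr=0.05)
print(abs(v @ u))                    # correlation with the planted direction
```

With a large spectral gap, the estimate's correlation with the planted direction approaches 1; the hard instances discussed in the abstract are precisely those where no single fixed `lr` works once the gap shrinks to $O(\log d/\log\log d)$.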