When the rows of an $n \times d$ matrix $A$ are given in a stream, we study algorithms for approximating the top eigenvector of the matrix ${A}^TA$ (equivalently, the top right singular vector of $A$). We consider worst-case inputs $A$ but assume that the rows are presented to the streaming algorithm in a uniformly random order. We show that when the gap parameter $R = \sigma_1(A)^2/\sigma_2(A)^2 = \Omega(1)$, there is a randomized algorithm that uses $O(h \cdot d \cdot \operatorname{polylog}(d))$ bits of space and outputs a unit vector $v$ with correlation $1 - O(1/\sqrt{R})$ with the top eigenvector $v_1$. Here $h$ denotes the number of \emph{heavy rows} in the matrix, defined as the rows with Euclidean norm at least $\|{A}\|_F/\sqrt{d \cdot \operatorname{polylog}(d)}$. We also provide a lower bound showing that any algorithm using $O(hd/R)$ bits of space can obtain at most $1 - \Omega(1/R^2)$ correlation with the top eigenvector. Thus, parameterizing the space complexity in terms of the number of heavy rows is necessary for high-accuracy solutions. Our results improve upon the $R = \Omega(\log n \cdot \log d)$ requirement in a recent work of Price and Xun (FOCS 2024). We note that the algorithm of Price and Xun works for arbitrary order streams, whereas our algorithm requires the stronger assumption that the rows are presented in a uniformly random order. We additionally show that the gap requirements in their analysis can be brought down to $R = \Omega(\log^2 d)$ for arbitrary order streams and $R = \Omega(\log d)$ for random order streams. The requirement of $R = \Omega(\log d)$ for random order streams is nearly tight for their analysis, as we exhibit a simple instance with $R = \Omega(\log d/\log\log d)$ for which their algorithm, with any fixed learning rate, cannot output a vector approximating the top eigenvector $v_1$.
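The "fixed learning rate" mentioned above refers to an Oja-style streaming iteration: each arriving row nudges a running unit-vector estimate toward the top eigenvector of $A^TA$. The sketch below is illustrative only and is not the algorithm of this paper or of Price and Xun; the function name, the synthetic data, and the learning-rate value are assumptions chosen for the demonstration.

```python
import numpy as np

def oja_top_eigenvector(rows, d, lr):
    """Single-pass Oja iteration with a fixed learning rate `lr`:
    for each streamed row a, update v toward (a a^T) v and renormalize.
    Illustrative sketch, not the paper's algorithm."""
    rng = np.random.default_rng(0)
    v = rng.standard_normal(d)
    v /= np.linalg.norm(v)          # random unit-norm start
    for a in rows:
        v = v + lr * a * (a @ v)    # rank-one update toward a (a . v)
        v /= np.linalg.norm(v)      # project back to the unit sphere
    return v

# Synthetic stream with a large gap R: rows concentrate along one direction.
rng = np.random.default_rng(1)
d = 20
u = np.zeros(d)
u[0] = 1.0                           # planted top direction
rows = [rng.normal(0.0, 3.0) * u + rng.normal(0.0, 0.1, d)
        for _ in range(2000)]
v = oja_top_eigenvector(rows, d, lr=0.05)
print(abs(v @ u))                    # correlation with the planted direction
```

With a large spectral gap, the estimate's correlation with the planted direction approaches 1; the hard instances discussed in the abstract are precisely those where no single fixed `lr` works once the gap shrinks to $O(\log d/\log\log d)$.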