Invariant subspaces and PCA in nearly matrix multiplication time

Approximating invariant subspaces of generalized eigenvalue problems (GEPs) is a fundamental computational problem at the core of machine learning and scientific computing. It underlies, for example, Principal Component Analysis (PCA) for dimensionality reduction, data visualization, and noise filtering, and Density Functional Theory (DFT), arguably the most popular method to calculate the electronic structure of materials. For a Hermitian definite GEP $HC=SC\Lambda$, let $\Pi_k$ be the true spectral projector on the invariant subspace that is associated with the $k$ smallest (or largest) eigenvalues. Given $H$, $S$, an integer $k$, and accuracy $\varepsilon\in(0,1)$, we show that we can compute a matrix $\widetilde\Pi_k$ such that $\lVert\Pi_k-\widetilde\Pi_k\rVert_2\leq \varepsilon$, in $O\left( n^{\omega+\eta}\mathrm{polylog}(n,\varepsilon^{-1},\kappa(S),\mathrm{gap}_k^{-1}) \right)$ bit operations in the floating point model with probability $1-1/n$. Here, $\eta>0$ is arbitrarily small, $\omega\lesssim 2.372$ is the matrix multiplication exponent, $\kappa(S)=\lVert S\rVert_2\lVert S^{-1}\rVert_2$, and $\mathrm{gap}_k$ is the gap between eigenvalues $k$ and $k+1$. To the best of our knowledge, this is the first end-to-end analysis achieving such "forward-error" approximation guarantees with nearly $O(n^{\omega+\eta})$ bit complexity, improving on classical $\widetilde O(n^3)$ eigensolvers, even for the regular case $(S=I)$. Our methods rely on a new $O(n^{\omega+\eta})$ stability analysis for the Cholesky factorization, and a new smoothed analysis for computing spectral gaps, which may be of independent interest. Ultimately, we obtain new matrix multiplication-type bit complexity upper bounds for PCA problems, including classical PCA and (randomized) low-rank approximation.
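To make the objects in the abstract concrete, the following is a minimal sketch of the classical $\widetilde O(n^3)$ baseline, not the fast algorithm analyzed in the paper. It reduces the pencil $(H,S)$ to a standard Hermitian problem through the Cholesky factor $S=LL^*$, diagonalizes the reduced matrix, and forms the orthogonal projector onto the eigenvectors of the $k$ smallest eigenvalues. The function name `spectral_projector_gep`, the choice to express the projector in the Cholesky-reduced coordinates, and the random test matrices are assumptions made for illustration only.

```python
import numpy as np

def spectral_projector_gep(H, S, k):
    """Dense O(n^3) baseline (illustrative, not the paper's method).

    Returns the orthogonal projector onto the invariant subspace of the
    k smallest eigenvalues of H_red = L^{-1} H L^{-*}, where S = L L^*,
    together with gap_k and all eigenvalues in ascending order.
    """
    n = H.shape[0]
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < n")
    L = np.linalg.cholesky(S)               # S = L L^*, L lower triangular
    Y = np.linalg.solve(L, H)               # Y = L^{-1} H
    H_red = np.linalg.solve(L, Y.conj().T)  # L^{-1} H L^{-*} since H = H^*
    H_red = (H_red + H_red.conj().T) / 2    # remove rounding asymmetry
    evals, V = np.linalg.eigh(H_red)        # ascending eigenvalues
    V_k = V[:, :k]                          # eigenvectors of the k smallest
    P = V_k @ V_k.conj().T                  # orthogonal spectral projector
    gap_k = evals[k] - evals[k - 1]         # gap between eigenvalues k, k+1
    return P, gap_k, evals

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, k = 6, 2
    A = rng.standard_normal((n, n))
    H = (A + A.T) / 2                       # symmetric (real Hermitian) H
    B = rng.standard_normal((n, n))
    S = B @ B.T + n * np.eye(n)             # symmetric positive definite S
    P, gap_k, evals = spectral_projector_gep(H, S, k)
    print("idempotent:", np.allclose(P @ P, P))
    print("rank (trace):", round(float(np.trace(P))))
    print("kappa(S):", np.linalg.cond(S, 2))
    print("gap_k:", gap_k)
```

The eigenvalues of the reduced matrix coincide with the generalized eigenvalues of $(H,S)$, so `gap_k` here matches the $\mathrm{gap}_k$ in the complexity bound, and `np.linalg.cond(S, 2)` gives $\kappa(S)$. Both quantities enter the stated bound only polylogarithmically.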


PCA

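In statistics, principal component analysis (PCA) is a method for projecting data from a higher-dimensional space into a lower-dimensional one while retaining as much of the variance as possible along each new dimension. Given a collection of points in two, three, or more dimensions, a "best-fitting" line can be defined as the line that minimizes the average squared distance from the points to the line. The next best-fitting line is chosen in the same way from the directions perpendicular to the first. Repeating this process yields an orthogonal basis in which the individual dimensions of the data are uncorrelated. These basis vectors are called the principal components.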
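As an illustration of the definition above, here is a minimal sketch of classical PCA through the eigendecomposition of the sample covariance matrix, which is the regular case $S=I$ of the eigenproblem in the abstract. The function name `pca` and the synthetic data are hypothetical and chosen only for this example. It uses a standard dense eigensolver, not the fast algorithm from the paper.

```python
import numpy as np

def pca(X, k):
    """Top-k principal components of the rows of X (samples x features)."""
    Xc = X - X.mean(axis=0)                 # center each feature
    cov = Xc.T @ Xc / (X.shape[0] - 1)      # sample covariance matrix
    evals, evecs = np.linalg.eigh(cov)      # ascending eigenvalues
    order = np.argsort(evals)[::-1][:k]     # indices of the k largest
    components = evecs[:, order]            # orthonormal basis vectors
    variances = evals[order]                # variance along each component
    scores = Xc @ components                # data in the new coordinates
    return components, variances, scores

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic correlated data: 200 samples, 3 features (illustrative only).
    X = rng.standard_normal((200, 3)) @ np.array([[3.0, 0.0, 0.0],
                                                  [1.0, 1.0, 0.0],
                                                  [0.5, 0.2, 0.1]])
    components, variances, scores = pca(X, 2)
    # The projected coordinates are uncorrelated: their covariance is diagonal.
    print(np.allclose(np.cov(scores, rowvar=False), np.diag(variances)))
```

The final check reflects the statement in the text: in the basis of principal components, the covariance of the projected data is diagonal, with the retained variances on the diagonal.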
