Tensor decomposition serves as a powerful primitive in statistics and machine learning, and has numerous applications in problems such as learning latent variable models or mixtures of Gaussians. In this paper, we focus on using power iteration to decompose an overcomplete random tensor. Past work studying the properties of tensor power iteration either requires a non-trivial data-independent initialization, or is restricted to the undercomplete regime. Moreover, several papers implicitly suggest that logarithmically many iterations (in terms of the input dimension) are sufficient for the power method to recover one of the tensor components. Here we present a novel analysis of the dynamics of tensor power iteration from random initialization in the overcomplete regime, where the tensor rank is much greater than its dimension. Surprisingly, we show that polynomially many steps are necessary for convergence of tensor power iteration to any of the true components, which refutes the previous conjecture. On the other hand, our numerical experiments suggest that tensor power iteration successfully recovers tensor components for a broad range of parameters in polynomial time. To further complement our empirical evidence, we prove that a popular objective function for tensor decomposition is strictly increasing along the power iteration path. Our proof is based on the Gaussian conditioning technique, which has been applied to analyze the approximate message passing (AMP) algorithm. The major ingredient of our argument is a conditioning lemma that allows us to generalize AMP-type analysis to the non-proportional limit and to polynomially many iterations of the power method.
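To make the setting concrete, the following is a minimal sketch of tensor power iteration from random initialization. The abstract does not fix the tensor order or the exact objective, so several choices here are assumptions made only for illustration: a third-order tensor T = sum_i a_i ⊗ a_i ⊗ a_i with n > d components drawn uniformly from the unit sphere, the update x <- T(I, x, x) / ||T(I, x, x)||, and the cubic form T(x, x, x) as the objective being tracked. The function names (`random_components`, `power_step`, `objective`, `run_power_iteration`) are hypothetical and not taken from the paper.

```python
import numpy as np


def random_components(n, d, rng):
    """Draw n random unit vectors in R^d (assumed component model)."""
    A = rng.standard_normal((n, d))
    return A / np.linalg.norm(A, axis=1, keepdims=True)


def power_step(A, x):
    """One power-iteration step for T = sum_i a_i (x) a_i (x) a_i.

    T(I, x, x) = sum_i <a_i, x>^2 a_i, so the d x d x d tensor is
    never formed explicitly.
    """
    c = A @ x                 # overlaps <a_i, x>, shape (n,)
    y = A.T @ (c ** 2)        # sum_i <a_i, x>^2 a_i, shape (d,)
    return y / np.linalg.norm(y)


def objective(A, x):
    """Cubic form T(x, x, x) = sum_i <a_i, x>^3 (assumed objective)."""
    return float(np.sum((A @ x) ** 3))


def run_power_iteration(A, steps, rng):
    """Run power iteration from a uniformly random unit vector."""
    d = A.shape[1]
    x = rng.standard_normal(d)
    x /= np.linalg.norm(x)
    history = [objective(A, x)]
    for _ in range(steps):
        x = power_step(A, x)
        history.append(objective(A, x))
    return x, history


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, n = 50, 200            # overcomplete: rank n exceeds dimension d
    A = random_components(n, d, rng)
    x, history = run_power_iteration(A, steps=100, rng=rng)
    # Largest absolute overlap with any true component after the run.
    print("max overlap:", np.max(np.abs(A @ x)))
    print("objective, first and last:", history[0], history[-1])
```

Inspecting `history` for different values of d and n is one way to observe empirically how the objective evolves along the iteration path and how many steps are needed before the iterate aligns with a component; the sketch itself makes no claim about either.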