Statistical-Computational Trade-offs in Tensor PCA and Related Problems via Communication Complexity

Tensor PCA is a stylized statistical inference problem introduced by Montanari and Richard to study the computational difficulty of estimating an unknown parameter from higher-order moment tensors. Unlike its matrix counterpart, Tensor PCA exhibits a statistical-computational gap, i.e., a sample size regime where the problem is information-theoretically solvable but conjectured to be computationally hard. This paper derives computational lower bounds on the run-time of memory-bounded algorithms for Tensor PCA using communication complexity. These lower bounds specify a trade-off among the number of passes through the data sample, the sample size, and the memory required by any algorithm that successfully solves Tensor PCA. While the lower bounds do not rule out polynomial-time algorithms, they do imply that many commonly used algorithms, such as gradient descent and the power method, must have a higher iteration count when the sample size is not large enough. Similar lower bounds are obtained for Non-Gaussian Component Analysis, a family of statistical estimation problems in which low-order moment tensors carry no information about the unknown parameter. Finally, stronger lower bounds are obtained for an asymmetric variant of Tensor PCA and related statistical estimation problems. These results explain why many estimators for these problems use a memory state that is significantly larger than the effective dimensionality of the parameter of interest.
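
To make the three resources in the trade-off concrete (passes over the data, sample size, memory), the following is a minimal sketch of a multi-pass tensor power method for order-3 symmetric Tensor PCA. It is an illustration only, not the construction analyzed in the paper. The sample model X_i = lam * v⊗v⊗v + sigma * W_i with i.i.d. standard Gaussian W_i is an assumed normalization, and the names `make_stream`, `streaming_power_iteration`, `lam`, `sigma`, and `seed` are hypothetical. Each outer iteration makes one pass over the n samples, and the state carried between samples is a pair of length-d vectors, even though each sample has d^3 entries.

```python
import numpy as np


def make_stream(v, lam, n, sigma=1.0, seed=0):
    """Return a zero-argument function that replays the same n samples.

    Assumed model (an illustration, not the paper's normalization):
    X_i = lam * v (x) v (x) v + sigma * W_i, with W_i i.i.d. N(0, 1) entries.
    """
    d = v.shape[0]
    signal = lam * np.einsum("i,j,k->ijk", v, v, v)

    def stream():
        # Re-seeding on every call means each pass sees the same data set.
        rng = np.random.default_rng(seed)
        for _ in range(n):
            yield signal + sigma * rng.standard_normal((d, d, d))

    return stream


def streaming_power_iteration(stream, u0, passes):
    """Tensor power method; one outer iteration equals one pass over the data.

    Only the current iterate `u` and the accumulator `acc` persist between
    samples, so the memory state has O(d) numbers.
    """
    u = u0 / np.linalg.norm(u0)
    for _ in range(passes):
        acc = np.zeros_like(u)
        for X in stream():
            # Contract the sample with u along its last two modes: X(., u, u).
            acc += np.einsum("ijk,j,k->i", X, u, u)
        u = acc / np.linalg.norm(acc)
    return u


if __name__ == "__main__":
    d, n = 10, 50
    rng = np.random.default_rng(123)
    v = rng.standard_normal(d)
    v /= np.linalg.norm(v)
    stream = make_stream(v, lam=5.0, n=n, sigma=1.0, seed=7)
    u = streaming_power_iteration(stream, rng.standard_normal(d), passes=5)
    print("overlap |<u, v>| after 5 passes:", abs(float(u @ v)))
```

In this sketch the number of passes equals the iteration count of the power method, which is the quantity the lower bounds constrain for algorithms with a small memory state. Whether the iterate actually aligns with `v` depends on `lam`, `n`, and the initialization, so the demo prints the overlap rather than assuming a value.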
