Many problems in high-dimensional statistics appear to have a statistical-computational gap: a range of values of the signal-to-noise ratio where inference is information-theoretically possible, but (conjecturally) computationally intractable. A canonical such problem is Tensor PCA, where we observe a tensor $Y$ consisting of a rank-one signal plus Gaussian noise. Multiple lines of work suggest that Tensor PCA becomes computationally hard at a critical value of the signal's magnitude. In particular, below this transition, no low-degree polynomial algorithm can detect the signal with high probability; conversely, various spectral algorithms are known to succeed above this transition. We unify and extend this work by considering tensor networks, orthogonally invariant polynomials where multiple copies of $Y$ are "contracted" to produce scalars, vectors, matrices, or other tensors. We define a new set of objects, tensor cumulants, which provide an explicit, near-orthogonal basis for invariant polynomials of a given degree. This basis lets us unify and strengthen previous results on low-degree hardness, giving a combinatorial explanation of the hardness transition and of a continuum of subexponential-time algorithms that work below it, and proving tight lower bounds against low-degree polynomials for recovering rather than just detecting the signal. It also lets us analyze a new problem of distinguishing between different tensor ensembles, such as Wigner and Wishart tensors, establishing a sharp computational threshold and giving evidence of a new statistical-computational gap in the Central Limit Theorem for random tensors. Finally, we believe these cumulants are valuable mathematical objects in their own right: they generalize the free cumulants of free probability theory from matrices to tensors, and share many of their properties, including additivity under additive free convolution.
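As a concrete illustration of the objects named above, the following is a minimal sketch, not the paper's construction. It samples a spiked order-3 tensor and evaluates the two simplest tensor networks built from two copies of $Y$: a scalar (all indices contracted) and a matrix (two indices contracted). The abstract does not fix the tensor order or the normalization, so the choices here (order 3, unit-norm signal vector, i.i.d. $N(0,1)$ noise entries) are assumptions, and the function names `spiked_tensor`, `rotate`, `scalar_invariant`, `matrix_network`, and `givens` are hypothetical. The tensor cumulants themselves are not implemented.

```python
import math
import random


def spiked_tensor(n, lam, rng):
    """Return (Y, v) with Y = lam * (v tensor v tensor v) + G.

    Assumed normalization: v is a uniformly random unit vector and
    G has i.i.d. N(0, 1) entries (no symmetrization).
    """
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    Y = [[[lam * v[i] * v[j] * v[k] + rng.gauss(0.0, 1.0)
           for k in range(n)] for j in range(n)] for i in range(n)]
    return Y, v


def givens(n, theta):
    """Orthogonal n x n matrix: a rotation by theta in the (0, 1) plane."""
    Q = [[1.0 if a == b else 0.0 for b in range(n)] for a in range(n)]
    Q[0][0], Q[0][1] = math.cos(theta), -math.sin(theta)
    Q[1][0], Q[1][1] = math.sin(theta), math.cos(theta)
    return Q


def rotate(Y, Q):
    """Apply the orthogonal matrix Q to every index of an order-3 tensor."""
    n = len(Y)

    def apply_first(T):
        # result[a][j][k] = sum_i Q[a][i] * T[i][j][k]
        return [[[sum(Q[a][i] * T[i][j][k] for i in range(n))
                  for k in range(n)] for j in range(n)] for a in range(n)]

    def cycle(T):
        # result[j][k][i] = T[i][j][k]  (move the first index to the end)
        return [[[T[i][j][k] for i in range(n)]
                 for k in range(n)] for j in range(n)]

    T = Y
    for _ in range(3):  # rotate each of the three indices in turn
        T = cycle(apply_first(T))
    return T


def scalar_invariant(Y):
    """Two copies of Y contracted on all indices: sum_{ijk} Y_ijk^2."""
    return sum(x * x for plane in Y for row in plane for x in row)


def matrix_network(Y):
    """Two copies of Y contracted on two indices: M_ab = sum_{jk} Y_ajk Y_bjk."""
    n = len(Y)
    return [[sum(Y[a][j][k] * Y[b][j][k]
                 for j in range(n) for k in range(n))
             for b in range(n)] for a in range(n)]


# Usage: the scalar network is unchanged by an orthogonal change of basis.
rng = random.Random(0)
Y, v = spiked_tensor(4, 2.0, rng)
Q = givens(4, 0.7)
before = scalar_invariant(Y)
after = scalar_invariant(rotate(Y, Q))
```

The scalar network is orthogonally invariant, so `before` and `after` agree up to floating-point error. The matrix network is not invariant but equivariant: it transforms as $M \mapsto Q M Q^\top$, so its spectrum is invariant. That is what makes such contractions natural inputs to spectral algorithms.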