We study nonconvex optimization in high dimensions through Langevin dynamics, focusing on the multi-spiked tensor PCA problem. This tensor estimation problem consists of recovering $r$ hidden signal vectors (spikes) from noisy Gaussian tensor observations by maximum likelihood estimation. We analyze the number of samples required for Langevin dynamics to recover the spikes efficiently and determine the separation condition on the signal-to-noise ratios (SNRs) needed for exact recovery, distinguishing the cases $p \ge 3$ and $p=2$, where $p$ denotes the order of the tensor. In particular, we show that the sample complexity required to recover the spike associated with the largest SNR matches the well-known algorithmic threshold for the single-spike case, whereas this threshold degrades when all $r$ spikes are to be recovered. As a key step, we provide a detailed characterization of the trajectories and interactions of the low-dimensional projections that capture the high-dimensional dynamics.
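To make the setting concrete, the following is a minimal sketch of a spiked order-3 tensor and one discretized Langevin step on the unit sphere. It is illustrative only and rests on assumptions that the abstract does not state: the function names `make_spiked_tensor` and `langevin_step`, the normalization of signal and noise, the restriction to $p = 3$, the use of a single estimator vector rather than $r$ jointly evolving ones, and the projected Euler discretization are all hypothetical choices, not the construction analyzed in the paper.

```python
import numpy as np


def make_spiked_tensor(n, snrs, noise_scale=1.0, seed=0):
    """Build an order-3 tensor sum_i snr_i * v_i (x) v_i (x) v_i + Gaussian noise.

    The normalization here is an assumption made for illustration.
    """
    rng = np.random.default_rng(seed)
    r = len(snrs)
    # Orthonormal spikes: columns of Q from the QR factorization of a Gaussian matrix.
    spikes, _ = np.linalg.qr(rng.standard_normal((n, r)))
    weights = np.asarray(snrs, dtype=float)
    signal = np.einsum("k,ik,jk,lk->ijl", weights, spikes, spikes, spikes)
    noise = noise_scale * rng.standard_normal((n, n, n))
    return signal + noise, spikes


def langevin_step(x, tensor, step, beta, rng):
    """One projected Euler step of Langevin dynamics on the unit sphere.

    Ascends the objective <T, x (x) x (x) x> at inverse temperature beta.
    """
    # Gradient of <T, x (x) x (x) x> for a generic (not necessarily symmetric) T.
    grad = (
        np.einsum("ijl,j,l->i", tensor, x, x)
        + np.einsum("jil,j,l->i", tensor, x, x)
        + np.einsum("jli,j,l->i", tensor, x, x)
    )
    # Keep only the components tangent to the sphere at x.
    grad = grad - (grad @ x) * x
    noise = rng.standard_normal(x.shape)
    noise = noise - (noise @ x) * x
    x_new = x + step * grad + np.sqrt(2.0 * step / beta) * noise
    # Retract back onto the unit sphere.
    return x_new / np.linalg.norm(x_new)
```

In a sketch like this, the correlations `x @ spikes[:, i]` between the estimator and each spike are one natural candidate for the low-dimensional projections whose trajectories the abstract refers to; tracking them over many calls to `langevin_step` shows which spike, if any, the dynamics align with.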