Local-search methods are widely used in statistical applications, yet their theoretical foundations remain underexplored compared with other classes of estimators, such as low-degree polynomials and spectral methods. Among the few existing results, recent studies have revealed a significant "local-computational" gap in the well-studied sparse tensor principal component analysis (PCA) model, where a broad class of local Markov chain methods underperforms other polynomial-time algorithms. In this work, we propose a series of local-search methods that provably "close" this gap, matching the best known polynomial-time procedures in multiple regimes of the model, including and going beyond the previously studied regimes in which the broad family of local Markov chain methods underperforms. Our framework includes: (1) standard greedy and randomized greedy algorithms applied to the (regularized) posterior of the model; and (2) novel random-threshold variants, in which the randomized greedy algorithm accepts a proposed transition if and only if the corresponding change in the Hamiltonian exceeds a random Gaussian threshold, rather than if and only if it is positive, as is customary. The random thresholds enable a tight mathematical analysis of the randomized greedy algorithm's trajectory by breaking the dependencies between iterations, and could be of independent interest to the community.
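To make the random-threshold acceptance rule concrete, the following is a minimal sketch and not the procedure analyzed in this work. It rests on several assumptions that the abstract does not specify: an order-3 tensor, a binary signal in {0,1}^n, a Hamiltonian of the form ⟨Y, σ⊗σ⊗σ⟩ minus a linear sparsity penalty standing in for the regularization, single-coordinate flips proposed uniformly at random, and a fresh centered Gaussian threshold at every step. The names `hamiltonian`, `random_threshold_greedy`, `penalty`, and `threshold_std` are hypothetical.

```python
import numpy as np


def hamiltonian(Y, sigma, penalty):
    # Assumed form: <Y, sigma (x) sigma (x) sigma> minus a linear sparsity penalty.
    fit = float(np.einsum("ijk,i,j,k->", Y, sigma, sigma, sigma))
    return fit - penalty * float(sigma.sum())


def random_threshold_greedy(Y, sigma0, n_steps, threshold_std, penalty, rng):
    """Randomized greedy search with a fresh Gaussian acceptance threshold per step.

    With threshold_std == 0 this reduces to the customary rule:
    accept a flip if and only if the Hamiltonian strictly increases.
    """
    sigma = np.asarray(sigma0, dtype=float).copy()
    h = hamiltonian(Y, sigma, penalty)
    trajectory = [h]
    for _ in range(n_steps):
        # Propose flipping one coordinate chosen uniformly at random.
        i = rng.integers(sigma.size)
        proposal = sigma.copy()
        proposal[i] = 1.0 - proposal[i]
        h_new = hamiltonian(Y, proposal, penalty)
        # Independent Gaussian threshold (the scale is an assumption of this sketch).
        threshold = threshold_std * rng.standard_normal()
        if h_new - h > threshold:
            sigma, h = proposal, h_new
        trajectory.append(h)
    return sigma, trajectory


# Toy planted instance: Y = lam * x (x) x (x) x + Gaussian noise (hypothetical sizes).
rng = np.random.default_rng(0)
n, k, lam = 20, 5, 10.0
x = np.zeros(n)
x[:k] = 1.0
Y = lam * np.einsum("i,j,k->ijk", x, x, x) + rng.standard_normal((n, n, n))

sigma_rt, traj_rt = random_threshold_greedy(Y, np.zeros(n), 200, 1.0, 1.0, rng)
sigma_gr, traj_gr = random_threshold_greedy(Y, np.zeros(n), 200, 0.0, 1.0, rng)
```

Recomputing the full Hamiltonian at every proposal keeps the sketch short at the cost of O(n^3) work per step; an incremental update of the change in the Hamiltonian would be the natural optimization. Setting `threshold_std` to zero recovers the customary positive-change rule, which makes the two variants easy to compare on the same instance.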