In this paper, we investigate the optimal statistical performance of independent component analysis (ICA) and the impact of computational constraints on it. Our goal is twofold. On the one hand, we characterize the precise role of dimensionality in sample complexity and statistical accuracy, and how computational considerations may affect them. In particular, we show that the optimal sample complexity is linear in the dimension and that, interestingly, the commonly used sample kurtosis-based approaches are necessarily suboptimal. However, the optimal sample complexity becomes quadratic in the dimension, up to a logarithmic factor, if we restrict ourselves to estimates that can be computed by low-degree polynomial algorithms. On the other hand, we develop computationally tractable estimates that attain both the optimal sample complexity and the minimax optimal rates of convergence. We study the asymptotic properties of the proposed estimates and establish their asymptotic normality, which can be readily used for statistical inference. Our method is fairly easy to implement, and numerical experiments are presented to further demonstrate its practical merits.