We enhance Fan et al.'s (2019) one-round distributed principal component analysis algorithm by adding a second round of fixed-point iteration. A random matrix theory analysis reveals that the one-round estimator incurs strictly higher asymptotic error than the pooling estimator when the local signal-to-noise ratio is moderate. Remarkably, the second iteration round closes this efficiency gap; the result follows from a careful first-order perturbation analysis of eigenspaces. Experiments on synthetic and benchmark datasets consistently demonstrate the statistical advantage of the two-round method over the one-round approach.
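The sketch below illustrates the overall scheme under stated assumptions: round one averages the local top-K projection matrices and extracts the leading eigenvectors, as in the one-round distributed PCA of Fan et al. (2019); the second round shown here is an illustrative fixed-point-style refinement (one distributed orthogonal-iteration step, re-orthonormalized at the center), which is our assumed form of the update rather than the paper's verbatim algorithm. All function names are hypothetical.

```python
# Minimal sketch of one-round distributed PCA plus a hypothetical second
# refinement round. The second-round update below (a single distributed
# orthogonal-iteration step) is an illustrative assumption.
import numpy as np


def local_top_k(X, k):
    """Top-k eigenvectors of the local sample covariance X^T X / n."""
    S = X.T @ X / X.shape[0]
    _, vecs = np.linalg.eigh(S)          # eigenvalues in ascending order
    return vecs[:, -k:]                   # columns = leading eigenvectors


def one_round_pca(blocks, k):
    """Round 1: average the local projection matrices V_l V_l^T,
    then take the top-k eigenvectors of the average."""
    Vs = [local_top_k(X, k) for X in blocks]
    P = sum(V @ V.T for V in Vs) / len(Vs)
    _, vecs = np.linalg.eigh(P)
    return vecs[:, -k:]


def two_round_pca(blocks, k):
    """Round 2 (assumed update): each machine applies its local covariance
    to the current estimate; the center averages and re-orthonormalizes."""
    V = one_round_pca(blocks, k)
    agg = sum((X.T @ (X @ V)) / X.shape[0] for X in blocks) / len(blocks)
    Q, _ = np.linalg.qr(agg)              # re-orthonormalize the aggregate
    return Q


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, k, m, n = 50, 3, 10, 200
    # True k-dimensional signal subspace plus isotropic noise on each machine
    U = np.linalg.qr(rng.standard_normal((d, k)))[0]
    blocks = [rng.standard_normal((n, k)) * 3.0 @ U.T
              + rng.standard_normal((n, d)) for _ in range(m)]
    V1, V2 = one_round_pca(blocks, k), two_round_pca(blocks, k)
    # Subspace error: Frobenius norm of the projection-matrix difference
    err = lambda V: np.linalg.norm(V @ V.T - U @ U.T)
    print(f"one-round error: {err(V1):.3f}   two-round error: {err(V2):.3f}")
```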