The spectral clustering algorithm is often used as a binary clustering method for unclassified data by applying principal component analysis. Existing theoretical studies of the algorithm typically assume homoscedasticity, that is, that the two clusters share a common covariance matrix. However, this assumption is restrictive and often unrealistic in practice. In this paper, we therefore consider the allometric extension model, in which the first eigenvectors of the two covariance matrices and the difference of the two mean vectors all point in the same direction, and we provide a non-asymptotic bound on the error probability of the spectral clustering algorithm under this model. As a byproduct of this result, we obtain the consistency of the clustering method in high-dimensional settings.
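
To make the setting concrete, the following is a minimal simulation sketch, not the procedure analyzed in the paper. It assumes a balanced two-class Gaussian mixture whose covariances are spiked along a common unit vector `v`, which is one simple way to satisfy the allometric extension condition, and it clusters by the sign of the first principal component score, a common form of binary spectral clustering. The function names, the spiked covariance form, and all parameter values are illustrative assumptions.

```python
import numpy as np


def sample_allometric_mixture(n, dim, shift, top_vars, noise_var, rng):
    """Draw n points from a two-class Gaussian mixture (illustrative assumption).

    Class k has mean -shift * v (k = 0) or +shift * v (k = 1) and covariance
    noise_var * I + (top_vars[k] - noise_var) * v v^T. With top_vars[k] > noise_var,
    v is the first eigenvector of both covariances and also the direction of the
    mean difference, so the allometric extension condition holds.
    """
    v = rng.standard_normal(dim)
    v /= np.linalg.norm(v)                      # shared unit direction
    labels = rng.integers(0, 2, size=n)         # class membership, 0 or 1
    signs = 2.0 * labels - 1.0                  # -1 for class 0, +1 for class 1
    top_sd = np.sqrt(np.asarray(top_vars, dtype=float))[labels]
    z = rng.standard_normal((n, dim))           # isotropic standard normal noise
    along = z @ v                               # noise component along v
    # Rescale the component along v to the class-specific standard deviation
    # and add the class mean, leaving the orthogonal directions untouched.
    coef = (top_sd - np.sqrt(noise_var)) * along + signs * shift
    X = np.sqrt(noise_var) * z + coef[:, None] * v
    return X, labels


def spectral_cluster(X):
    """Assign two clusters by the sign of the first principal component score."""
    Xc = X - X.mean(axis=0)                     # center the data
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ vt[0]                         # projection onto the top direction
    return (scores > 0).astype(int)


def clustering_error(pred, labels):
    """Misclassification rate, minimized over the two possible labelings."""
    mismatch = np.mean(pred != labels)
    return min(mismatch, 1.0 - mismatch)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X, y = sample_allometric_mixture(
        n=400, dim=50, shift=4.0, top_vars=(2.0, 4.0), noise_var=1.0, rng=rng
    )
    print("empirical error rate:", clustering_error(spectral_cluster(X), y))
```

The two classes have different variances along `v` (2 and 4 in the demo), so the mixture is heteroscedastic, yet the leading principal direction of the pooled data still aligns with the mean difference. The error rate is minimized over label switching because cluster labels are only identifiable up to permutation.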