Spectral clustering, a recently emerged technique, outperforms conventional clustering methods by detecting clusters of arbitrary shape without assuming convexity. Unfortunately, its $O(n^3)$ computational complexity makes it infeasible for many real-world applications, where $n$ can be large. This has motivated researchers to propose approximate spectral clustering (ASC). However, most ASC methods assume that the number of clusters $k$ is known, and in practice setting $k$ manually can be subjective or time consuming. The proposed algorithm uses two relevance metrics to estimate $k$ in two vital steps of ASC: one selects the eigenvectors that span the embedding space, and the other discovers the number of clusters in that space. The algorithm uses a growing neural gas (GNG) approximation, chosen because GNG is superior at preserving the topology of the input data. Experiments demonstrate the efficiency of the proposed algorithm and its ability to compete with similar methods in which $k$ is set manually.
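To make the two estimation steps concrete, the following is a minimal sketch of how an embedding space and a value of $k$ can be derived from a graph Laplacian. It is not the proposed algorithm: the abstract does not define its two relevance metrics, so the well-known eigengap heuristic is swapped in as a stand-in, and the affinity graph is built directly on the input points rather than on GNG prototypes. The names `estimate_k_eigengap`, `sigma`, and `k_max` are hypothetical and chosen only for illustration.

```python
import numpy as np


def estimate_k_eigengap(points, sigma=1.0, k_max=10):
    """Estimate k with the eigengap heuristic (a stand-in, NOT the paper's metrics).

    points: (n, d) array; in an ASC setting these would be a small set of
            prototypes (e.g. GNG nodes) rather than all n input samples.
    Returns (k, embedding), where embedding holds the k selected eigenvectors.
    """
    # Gaussian affinity matrix from pairwise squared Euclidean distances.
    diff = points[:, None, :] - points[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    W = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)

    # Symmetric normalized Laplacian: L = I - D^{-1/2} W D^{-1/2}.
    deg = np.maximum(W.sum(axis=1), 1e-12)  # guard against isolated points
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    L = np.eye(len(points)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]

    # eigh returns eigenvalues in ascending order for symmetric matrices.
    eigvals, eigvecs = np.linalg.eigh(L)

    # The largest gap among the smallest eigenvalues suggests k.
    k_max = min(k_max, len(points) - 1)
    gaps = np.diff(eigvals[: k_max + 1])
    k = int(np.argmax(gaps)) + 1

    # The first k eigenvectors span the embedding space.
    return k, eigvecs[:, :k]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
    pts = np.vstack([c + 0.1 * rng.standard_normal((20, 2)) for c in centers])
    k, embedding = estimate_k_eigengap(pts, sigma=1.0)
    print(k, embedding.shape)
```

The full eigendecomposition is the $O(n^3)$ step mentioned above, which is why ASC methods run it on a reduced set of prototypes instead of on all $n$ samples. A single gap criterion like this one both picks the eigenvectors and fixes $k$, whereas the proposed algorithm treats these as two separate steps with their own metrics.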