Approximate spectral clustering (ASC) was developed to overcome the heavy computational demands of spectral clustering (SC) while retaining SC's ability to detect non-convex clusters. Because ASC involves a preprocessing step, it requires new similarity measures to assign weights to graph edges. The connectivity matrix (CONN) is an efficient similarity measure for constructing graphs for ASC. It defines the weight between two vertices as the number of points assigned to them during vector quantization training. However, this relationship is undirected, so it is not clear which of the two vertices contributes more to the edge. In addition, CONN can be misled by noisy density between clusters. We define a directed version of CONN, named DCONN, to gain insight into the contributions of vertices to edges. We also provide filtering schemes to ensure that CONN edges highlight potential clusters. Experiments show that the proposed filtering is highly efficient in cases where CONN cannot tolerate the noise.
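The edge weights described above can be illustrated with a minimal sketch. It assumes that a point is "assigned" to a pair of vertices when those two prototypes are its nearest and second-nearest prototypes under Euclidean distance. The abstract does not state this, so it is an interpretation. The function names `directed_conn` and `undirected_conn` and the toy data are hypothetical, and the directed counts are only one plausible reading of DCONN, not necessarily the definition used in the paper.

```python
def directed_conn(points, prototypes):
    """Count, for each ordered prototype pair (i, j), the points whose
    nearest prototype is i and whose second-nearest prototype is j.
    (Assumed reading of the directed weights; not taken from the paper.)"""
    k = len(prototypes)
    counts = [[0] * k for _ in range(k)]
    for p in points:
        # Rank prototype indices by squared Euclidean distance to this point.
        ranked = sorted(
            range(k),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(p, prototypes[i])),
        )
        nearest, second = ranked[0], ranked[1]
        counts[nearest][second] += 1
    return counts


def undirected_conn(directed):
    """Symmetrize the directed counts: weight(i, j) = d[i][j] + d[j][i]."""
    k = len(directed)
    return [[directed[i][j] + directed[j][i] for j in range(k)] for i in range(k)]


# Toy data (hypothetical): three prototypes on a line and four points.
prototypes = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
points = [(0.1, 0.0), (0.2, 0.0), (0.9, 0.0), (4.0, 0.0)]

dconn = directed_conn(points, prototypes)
conn = undirected_conn(dconn)
print(dconn)  # [[0, 2, 0], [1, 0, 0], [0, 1, 0]]
print(conn)   # [[0, 3, 0], [3, 0, 1], [0, 1, 0]]
```

In this toy case the undirected weight between prototypes 0 and 1 is 3, while the directed counts show that prototype 0 contributes two of those points and prototype 1 only one. This is the kind of per-vertex information an undirected weight hides.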