Deep clustering methods improve clustering performance by jointly optimizing deep representation learning and clustering. Although numerous deep clustering algorithms have been proposed, most of them rely on artificially constructed pseudo targets to perform clustering. Constructing these targets requires prior knowledge, and it is difficult to determine a suitable pseudo target for a given clustering task. To address this issue, we propose a deep embedding clustering algorithm driven by sample stability (DECS), which eliminates the need for pseudo targets. Specifically, we first construct an initial feature space with an autoencoder and then learn a cluster-oriented embedding constrained by sample stability. Sample stability explores the deterministic relationship between each sample and all cluster centroids, pulling samples toward their respective clusters and keeping them away from other clusters with high determinacy. We analyze the convergence of the loss theoretically using Lipschitz continuity, which verifies the validity of the model. Experimental results on five datasets show that the proposed method outperforms state-of-the-art clustering approaches.
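The abstract does not define sample stability, so the following is only a rough illustration of the idea of measuring how determinately a sample relates to all cluster centroids. It is not the DECS loss. The Student's t soft assignment and the entropy-based `determinacy` score are assumptions chosen for this sketch, and all function names are hypothetical.

```python
import numpy as np

def soft_assignments(z, centroids):
    """Soft assignment of each embedded sample to each centroid.

    Uses a Student's t kernel on squared Euclidean distance
    (an assumed choice for this sketch, not taken from the abstract).
    """
    # Pairwise squared distances, shape (n_samples, n_clusters)
    d2 = ((z[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
    q = 1.0 / (1.0 + d2)
    # Normalize so that each row sums to 1
    return q / q.sum(axis=1, keepdims=True)

def determinacy(q):
    """Hypothetical proxy score: 1 minus the normalized entropy of each row.

    Close to 1 when one cluster takes almost all of the assignment mass,
    close to 0 when the sample is equally attracted to every cluster.
    """
    k = q.shape[1]
    entropy = -(q * np.log(q + 1e-12)).sum(axis=1)
    return 1.0 - entropy / np.log(k)

# Toy embedding: two samples near the first centroid, one on the second,
# and one exactly halfway between the two centroids.
z = np.array([[0.0, 0.0], [0.1, -0.1], [5.0, 5.0], [2.5, 2.5]])
centroids = np.array([[0.0, 0.0], [5.0, 5.0]])

q = soft_assignments(z, centroids)
scores = determinacy(q)
labels = q.argmax(axis=1)
```

In this toy setup the halfway sample receives a score near zero, while samples sitting on a centroid score much higher. A training objective in the spirit of the abstract would push the embedding toward high scores for every sample, but the actual constraint used by DECS should be taken from the paper itself.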