This survey reviews convex clustering, a clustering method based on solving a convex optimization problem. Despite the plethora of existing clustering methods, convex clustering has several uncommon features that distinguish it from prior art. The optimization problem is free of spurious local minima, and its unique global minimizer is stable with respect to all its inputs, including the data, the tuning parameter, and the weight hyperparameters. Its single tuning parameter controls the number of clusters and can be chosen using standard techniques from penalized regression. We give intuition into the behavior and theory of convex clustering as well as practical guidance. We highlight important algorithms and discuss how their computational costs scale with the problem size. Finally, we survey the breadth of its applications and its flexibility to be combined with other inferential methods.
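
For orientation, the convex clustering problem is commonly written in the following form. The abstract does not state it, so the notation here is an assumption: data points x_i, one centroid u_i per point, tuning parameter gamma >= 0, and nonnegative weights w_ij.

```latex
\min_{u_1,\dots,u_n \in \mathbb{R}^p} \;
  \frac{1}{2} \sum_{i=1}^{n} \lVert x_i - u_i \rVert_2^2
  \;+\; \gamma \sum_{i<j} w_{ij} \, \lVert u_i - u_j \rVert
```

In this form the first term is strongly convex, which is why the minimizer is unique. Points i and j are placed in the same cluster when their centroids coincide (u_i = u_j). At gamma = 0 every point is its own centroid; as gamma grows, centroids fuse, and for sufficiently large gamma (with weights forming a connected graph) all centroids collapse to the data mean.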