This survey reviews a clustering method based on solving a convex optimization problem. Despite the plethora of existing clustering methods, convex clustering has several uncommon features that distinguish it from prior art. The optimization problem is free of spurious local minima, and its unique global minimizer is stable with respect to all of its inputs, including the data, the tuning parameter, and the weight hyperparameters. Its single tuning parameter controls the number of clusters and can be chosen using standard techniques from penalized regression. We give intuition for the behavior and theory of convex clustering, as well as practical guidance. We highlight important algorithms and give insight into how their computational costs scale with problem size. Finally, we highlight the breadth of convex clustering's uses and its flexibility to be combined and integrated with other inferential methods.
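For concreteness, the convex clustering problem is commonly written in the form below. The notation is assumed here for illustration and is not fixed by the text above: $x_1, \dots, x_n \in \mathbb{R}^p$ are the data points, $u_i$ is the centroid attached to point $x_i$, $w_{ij} \ge 0$ are the weight hyperparameters, and $\gamma \ge 0$ is the single tuning parameter.

\[
\min_{u_1, \dots, u_n \in \mathbb{R}^p} \;
\frac{1}{2} \sum_{i=1}^{n} \lVert x_i - u_i \rVert_2^2
\; + \; \gamma \sum_{i < j} w_{ij} \, \lVert u_i - u_j \rVert
\]

In this sketch, points $i$ and $j$ are placed in the same cluster when their centroids coincide, $u_i = u_j$. At $\gamma = 0$ the minimizer is $u_i = x_i$, so every point is its own cluster; as $\gamma$ grows the penalty fuses centroids, and for sufficiently large $\gamma$ (assuming the graph of positive weights is connected) all centroids collapse to the mean of the data. The squared-error term is strongly convex and the penalty is convex, which is why the global minimizer is unique.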