In this paper, we prove the strong consistency of the sparse K-means method proposed by Witten and Tibshirani (2010). For the Euclidean distance, we prove consistency in both risk and clustering, and we discuss the characterization of the limit of the clustering in some special cases. For general distances, we prove consistency in risk. Our results extend naturally to other models with the same objective function but different constraints, such as the l0 or l1 penalty in Chang et al. (2018).