A new method for clustering functional data via information maximization is proposed. The method learns a probabilistic classifier in an unsupervised manner by maximizing the mutual information (or squared-loss mutual information) between data points and cluster assignments. A notable advantage is that it involves only continuous optimization of model parameters, which is simpler than discrete optimization of cluster assignments and avoids the disadvantages of generative models. Unlike some existing methods, the proposed method requires neither estimating the probability densities of Karhunen-Loève expansion scores under different clusters nor the common eigenfunction assumption. Its empirical performance and applications are demonstrated through simulation studies and real data analyses. In addition, the proposed method allows for out-of-sample clustering, with performance comparable to that of some supervised classifiers.
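To make the objective concrete, the following is a minimal sketch of how the two information criteria named above can be evaluated from soft cluster assignments. It assumes the classifier outputs an `(n, K)` matrix of conditional probabilities p(y = k | x_i) and that the samples carry equal weight 1/n. The function names, the `eps` smoothing constant, and the plug-in form of the estimators are illustrative assumptions, not the estimators or classifier model used in the paper.

```python
import numpy as np


def mutual_information(probs, eps=1e-12):
    """Plug-in mutual information between samples and cluster labels.

    probs: (n, K) array whose row i holds p(y = k | x_i).
    Returns H(Y) - H(Y | X) under the empirical distribution of X.
    """
    probs = np.asarray(probs, dtype=float)
    marginal = probs.mean(axis=0)  # estimated cluster proportions p(y)
    h_marginal = -np.sum(marginal * np.log(marginal + eps))
    h_conditional = -np.mean(np.sum(probs * np.log(probs + eps), axis=1))
    return h_marginal - h_conditional


def squared_loss_mutual_information(probs, eps=1e-12):
    """Plug-in squared-loss mutual information (Pearson-divergence analogue)."""
    probs = np.asarray(probs, dtype=float)
    marginal = probs.mean(axis=0)
    ratio_term = np.mean(np.sum(probs**2 / (marginal + eps), axis=1))
    return 0.5 * (ratio_term - 1.0)


# Confident, balanced assignments carry the most information;
# uninformative (uniform) assignments carry none.
confident = np.array([[1.0, 0.0], [0.0, 1.0]])
uniform = np.array([[0.5, 0.5], [0.5, 0.5]])
mi_confident = mutual_information(confident)
mi_uniform = mutual_information(uniform)
smi_confident = squared_loss_mutual_information(confident)
smi_uniform = squared_loss_mutual_information(uniform)
```

Both criteria are smooth functions of the assignment probabilities, so they can be maximized over the classifier's parameters with gradient-based optimizers. Because the fitted classifier maps any curve to cluster probabilities, it can also be applied to new observations, which is what makes out-of-sample clustering possible.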