A novel elastic time distance for sparse multivariate functional data is proposed and used to develop a robust distance-based two-layer partition clustering method. With this distance, the new approach can not only recover the correct clusters of sparse multivariate functional data in the presence of outliers but also detect the outliers that do not belong to any cluster. Classical distance-based clustering methods, such as density-based spatial clustering of applications with noise (DBSCAN), agglomerative hierarchical clustering, and $K$-medoids, are extended to the sparse multivariate functional case using the newly proposed distance. Numerical experiments on simulated data show that the proposed algorithm outperforms existing model-based methods and these extended distance-based methods. The effectiveness of the proposed approach is demonstrated on Northwest Pacific cyclone track data.
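Distance-based methods such as $K$-medoids need only a matrix of pairwise distances, which is why they can be carried over to sparse multivariate functional data once a suitable distance is available. The following Python sketch illustrates that idea under strong assumptions. The function `placeholder_distance` is a hypothetical stand-in (a symmetrised nearest-in-time Euclidean gap) and is not the elastic time distance proposed here, whose definition is not given in this abstract. The toy curves, the farthest-first initialisation, and all function names are likewise illustrative, and the two-layer partition method itself is not reproduced.

```python
import math


def _one_sided(a, b):
    # Average gap from each observation in `a` to the observation in `b`
    # that is nearest in time.
    total = 0.0
    for t, x in a:
        _, y = min(b, key=lambda obs: abs(obs[0] - t))
        total += math.dist(x, y)
    return total / len(a)


def placeholder_distance(a, b):
    # ASSUMPTION: hypothetical stand-in, NOT the paper's elastic time distance.
    # Each curve is a list of (time, values) pairs on its own sparse time grid.
    return 0.5 * (_one_sided(a, b) + _one_sided(b, a))


def distance_matrix(curves, dist):
    # Symmetric pairwise distance matrix with a zero diagonal.
    n = len(curves)
    D = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            D[i][j] = D[j][i] = dist(curves[i], curves[j])
    return D


def k_medoids(D, k, max_iter=100):
    # Plain alternating K-medoids on a precomputed distance matrix.
    n = len(D)
    # Deterministic farthest-first initialisation (an illustrative choice).
    medoids = [0]
    while len(medoids) < k:
        medoids.append(max(range(n), key=lambda i: min(D[i][m] for m in medoids)))
    for _ in range(max_iter):
        # Assign every curve to its closest medoid.
        labels = [min(range(k), key=lambda c: D[i][medoids[c]]) for i in range(n)]
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:
                # Keep the old medoid if a cluster ends up empty.
                new_medoids.append(medoids[c])
                continue
            # New medoid: the member with the smallest total distance to the others.
            new_medoids.append(min(members, key=lambda m: sum(D[m][j] for j in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, labels


# Toy bivariate curves observed at different, sparse time points.
curves = [
    [(0.0, (0.0, 0.0)), (0.5, (0.1, 0.0)), (1.0, (0.2, 0.1))],  # group A
    [(0.1, (0.0, 0.1)), (0.9, (0.2, 0.0))],                      # group A
    [(0.0, (5.0, 5.0)), (0.4, (5.1, 5.0)), (1.0, (5.2, 5.1))],   # group B
    [(0.2, (5.0, 5.1)), (0.7, (5.1, 5.2))],                      # group B
]

D = distance_matrix(curves, placeholder_distance)
medoids, labels = k_medoids(D, k=2)
print(medoids, labels)
```

Because the clustering step only reads `D`, replacing `placeholder_distance` with another distance between sparse curves leaves `k_medoids` unchanged; the same matrix could also be passed to a DBSCAN or hierarchical clustering routine that accepts precomputed distances.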