In this paper, we consider the task of clustering a set of individual time series while modeling each cluster, that is, model-based time series clustering. The task requires a parametric model flexible enough to describe the dynamics of diverse time series. To address this problem, we propose a novel model-based time series clustering method with mixtures of linear Gaussian state space models, which are highly flexible. The proposed method uses a new expectation-maximization algorithm for the mixture model to estimate the model parameters, and determines the number of clusters using the Bayesian information criterion. Experiments on a simulated dataset demonstrate the effectiveness of the method in clustering, parameter estimation, and model selection. We also apply the method to real datasets commonly used to evaluate time series clustering methods. The results show that the proposed method produces clusterings that are as accurate as, or more accurate than, those obtained using previous methods.
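To make the three ingredients named above concrete (a linear Gaussian state space model per cluster, soft cluster assignment inside expectation-maximization, and the Bayesian information criterion), the following is a minimal sketch rather than the paper's algorithm. The abstract gives no equations, so the function names `kalman_loglik`, `responsibilities`, and `bic`, the parameter names `A`, `C`, `Q`, `R`, `mu0`, `P0`, and the choice of the number of series as the BIC sample size are all assumptions made for illustration. The M-step that re-estimates the parameters is omitted.

```python
import numpy as np

def kalman_loglik(y, A, C, Q, R, mu0, P0):
    """Log-likelihood of one series y (T x d) under a linear Gaussian state
    space model, computed with a standard Kalman filter.
    Assumed model: x_t = A x_{t-1} + w_t, y_t = C x_t + v_t,
    w_t ~ N(0, Q), v_t ~ N(0, R), x_1 ~ N(mu0, P0)."""
    mu, P = mu0, P0  # predicted mean and covariance of the current state
    total = 0.0
    for obs in y:
        # Innovation (one-step prediction error) and its covariance.
        resid = obs - C @ mu
        S = C @ P @ C.T + R
        S_inv = np.linalg.inv(S)
        _, logdet = np.linalg.slogdet(S)
        total += -0.5 * (len(obs) * np.log(2 * np.pi) + logdet
                         + resid @ S_inv @ resid)
        # Measurement update.
        K = P @ C.T @ S_inv
        mu_f = mu + K @ resid
        P_f = P - K @ C @ P
        # Time update (prediction for the next step).
        mu = A @ mu_f
        P = A @ P_f @ A.T + Q
    return total

def responsibilities(loglik, weights):
    """E-step-style soft assignment. loglik has shape (N series, M clusters),
    weights has shape (M,). Returns posterior cluster probabilities per series."""
    log_post = loglik + np.log(weights)
    log_post = log_post - log_post.max(axis=1, keepdims=True)  # stabilize exp
    post = np.exp(log_post)
    return post / post.sum(axis=1, keepdims=True)

def bic(mixture_loglik, n_params, n_samples):
    """Standard BIC; lower is better. What counts as a sample
    (series or individual observations) is an assumption left to the caller."""
    return -2.0 * mixture_loglik + n_params * np.log(n_samples)
```

Under these assumptions, one would evaluate `kalman_loglik` for every series against every cluster's parameters, pass the resulting matrix to `responsibilities`, and, after fitting mixtures with different numbers of clusters, keep the one with the lowest `bic`.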