This paper proposes a novel method for model-based clustering of time series. The procedure iterates over two steps: (i) K global forecasting models are fitted by pooling the series belonging to each cluster, and (ii) each series is assigned to the group whose model produces the best forecasts according to a given criterion. Unlike most techniques in the literature, the method treats predictive accuracy as the main element in constructing the clustering partition, whose groups jointly minimize the overall forecasting error. The approach thus leads to a new clustering paradigm in which the quality of a clustering solution is measured by its predictive capability. In addition, the procedure provides an effective mechanism for selecting the number of clusters in a time series database and can be combined with any class of regression model. An extensive simulation study shows that the method outperforms several alternative techniques in both clustering effectiveness and predictive accuracy. The approach is also applied to several datasets used as standard benchmarks in the time series literature, where it performs well.
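The two iterative steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a pooled linear autoregressive model of order `p` fitted by least squares as the global model, in-sample one-step-ahead mean squared error as the assignment criterion, and a random initial partition. All function names (`lag_matrix`, `fit_pooled_ar`, `forecast_error`, `cluster_by_forecast`, `simulate_ar1`) are hypothetical and introduced only for illustration.

```python
import numpy as np


def lag_matrix(series, p):
    """Return (X, y): rows of the p most recent lags and the next-step targets."""
    x = np.asarray(series, dtype=float)
    # Column j holds lag j+1 of the target x[t], for t = p, ..., len(x) - 1.
    X = np.column_stack([x[p - j - 1 : len(x) - j - 1] for j in range(p)])
    return X, x[p:]


def fit_pooled_ar(series_list, p):
    """Step (i): fit one global AR(p) model by pooling all series in a cluster."""
    Xs, ys = zip(*(lag_matrix(s, p) for s in series_list))
    y = np.concatenate(ys)
    X = np.column_stack([np.ones(len(y)), np.vstack(Xs)])  # intercept + lags
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def forecast_error(series, coef, p):
    """One-step-ahead MSE of a fitted model on a single series (assumed criterion)."""
    X, y = lag_matrix(series, p)
    pred = coef[0] + X @ coef[1:]
    return float(np.mean((y - pred) ** 2))


def cluster_by_forecast(series_list, k, p=1, n_iter=20, seed=0):
    """Alternate between fitting k pooled models and reassigning each series."""
    rng = np.random.default_rng(seed)
    n = len(series_list)
    labels = rng.integers(0, k, size=n)  # assumed: random initial partition
    for _ in range(n_iter):
        models = []
        for c in range(k):
            members = [s for s, lab in zip(series_list, labels) if lab == c]
            if not members:  # assumed fix for an empty cluster: reseed it
                members = [series_list[rng.integers(n)]]
            models.append(fit_pooled_ar(members, p))
        # Step (ii): assign each series to the model with the lowest error.
        errors = np.array(
            [[forecast_error(s, m, p) for m in models] for s in series_list]
        )
        new_labels = errors.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # partition is stable
        labels = new_labels
    return labels, models, float(errors.min(axis=1).mean())


def simulate_ar1(phi, n, rng):
    """Generate a zero-mean AR(1) series with unit-variance Gaussian noise."""
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + rng.standard_normal()
    return x
```

For example, calling `cluster_by_forecast(data, k=2, p=1)` on a list of series simulated with `simulate_ar1(0.9, ...)` and `simulate_ar1(-0.9, ...)` returns the labels, the fitted coefficient vectors, and the average per-series error of the final assignment. Because the sketch depends only on a fit function and an error function, the linear model could in principle be swapped for another regression model, in line with the generality claimed above; using held-out forecasts rather than in-sample errors would be a natural variation.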