Time-series clustering is a powerful data mining technique for time-series data when no prior knowledge about the clusters is available. Large volumes of large-sized time-series data have been acquired and used in various research fields, so a clustering method with low computational cost is required. Because quantum-inspired computing technology, such as a simulated annealing machine, surpasses conventional computers in solving combinatorial optimization problems quickly and accurately, it holds promise for clustering tasks that are challenging for existing methods. This study proposes a novel time-series clustering method that leverages an annealing machine. The proposed method classifies time-series data evenly into clusters that are close to each other while remaining robust against outliers. Moreover, it is applicable to time-series images. We compared the proposed method with a standard existing method by clustering an online distributed dataset. In the existing method, the distances between data points are calculated using the Euclidean distance metric, and the clustering is performed using the k-means++ method. Both methods yielded comparable results. Furthermore, we applied the proposed method to a flow measurement image dataset containing noticeable noise, with a signal-to-noise ratio of approximately 1. Despite a small signal variation of approximately 2%, the proposed method classified the data without any overlap among the clusters. In contrast, the clustering results of the standard existing method and of the conditional image sampling (CIS) method, a specialized technique for flow measurement data, showed overlapping clusters. The proposed method therefore gave better results than the other two methods, demonstrating its potential as a superior clustering method.
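The baseline named in the abstract (Euclidean distances between series, clustered with k-means++) can be illustrated with a minimal pure-Python sketch. This is not the proposed annealing-machine method, and it is not the authors' code: the function names `sq_euclidean`, `kmeans_pp_init`, and `kmeans`, and the parameters `n_iter` and `seed`, are hypothetical choices made for illustration. It assumes all series have the same length.

```python
import random


def sq_euclidean(a, b):
    """Squared Euclidean distance between two equal-length series."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pp_init(series, k, rng):
    """k-means++ seeding: each new center is drawn with probability
    proportional to its squared distance from the nearest chosen center."""
    centers = [list(rng.choice(series))]
    while len(centers) < k:
        d2 = [min(sq_euclidean(s, c) for c in centers) for s in series]
        if sum(d2) == 0:
            # Every series coincides with an existing center; fall back to uniform.
            centers.append(list(rng.choice(series)))
        else:
            centers.append(list(rng.choices(series, weights=d2, k=1)[0]))
    return centers


def kmeans(series, k, n_iter=100, seed=0):
    """Lloyd iterations starting from k-means++ seeds (illustrative only)."""
    rng = random.Random(seed)
    centers = kmeans_pp_init(series, k, rng)
    labels = None
    for _ in range(n_iter):
        # Assignment step: each series goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda j: sq_euclidean(s, centers[j]))
            for s in series
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each center becomes the pointwise mean of its members.
        for j in range(k):
            members = [s for s, lab in zip(series, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers
```

Calling `kmeans(series, k=2)` on a list of equal-length numeric lists returns one integer label per series together with the final cluster centers. Note that plain k-means does not enforce equally sized clusters, which is one of the properties the abstract attributes to the proposed method.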