Time-series clustering is a powerful data mining technique for time-series data when no prior knowledge about the clusters is available. Large volumes of time-series data have been acquired and used in various research fields, so clustering methods with low computational cost are required. Because quantum-inspired computing technologies, such as simulated annealing machines, can solve combinatorial optimization problems faster and more accurately than conventional computers, they hold promise for clustering tasks that are difficult to accomplish with existing methods. This study proposes a novel time-series clustering method that leverages an annealing machine. The proposed method classifies time-series data evenly into clusters close to each other while remaining robust against outliers. Moreover, it is applicable to time-series images. We compared the proposed method with a standard existing method by clustering a dataset distributed online. In the existing method, the distances between data points are calculated using the Euclidean distance metric, and the clustering is performed using the k-means++ method. We found that both methods yielded comparable results. Furthermore, the proposed method was applied to a flow measurement image dataset containing noticeable noise, with a signal-to-noise ratio of approximately 1. Despite a small signal variation of approximately 2%, the proposed method classified the data without any overlap among the clusters. In contrast, the clustering results of the standard existing method and of the conditional image sampling (CIS) method, a specialized technique for flow measurement data, showed overlapping clusters. The proposed method therefore gave better results than the other two methods, demonstrating its potential as a superior clustering method.
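For illustration only, the baseline described above (Euclidean distances between series, clustered with k-means++) can be sketched in plain Python. This is a minimal sketch, not the authors' implementation: the function names `euclidean`, `kmeanspp_init`, and `kmeans`, the toy series, and the iteration limit are all assumptions made for this example, and the annealing-based method itself is not shown because the abstract does not specify its formulation.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeanspp_init(series, k, rng):
    """Choose k initial centers with k-means++ seeding (hypothetical helper)."""
    centers = [list(rng.choice(series))]
    while len(centers) < k:
        # Squared distance from each series to its nearest chosen center.
        d2 = [min(euclidean(s, c) ** 2 for c in centers) for s in series]
        if sum(d2) == 0:
            # Every series coincides with a chosen center; pick uniformly.
            centers.append(list(rng.choice(series)))
        else:
            # Sample the next center with probability proportional to d2.
            centers.append(list(rng.choices(series, weights=d2, k=1)[0]))
    return centers


def kmeans(series, k, n_iter=100, seed=0):
    """Lloyd iterations started from k-means++ centers."""
    rng = random.Random(seed)
    centers = kmeanspp_init(series, k, rng)
    labels = None
    for _ in range(n_iter):
        # Assignment step: nearest center under the Euclidean metric.
        new_labels = [
            min(range(k), key=lambda j: euclidean(s, centers[j]))
            for s in series
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each center becomes the pointwise mean of its members.
        for j in range(k):
            members = [s for s, lab in zip(series, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Toy data (assumed for this example): two well-separated groups of series.
series = [
    [0.0, 0.1, 0.0, -0.1],
    [0.1, 0.0, -0.1, 0.0],
    [0.0, 0.0, 0.1, 0.1],
    [5.0, 5.1, 4.9, 5.0],
    [5.1, 5.0, 5.0, 4.9],
    [4.9, 5.0, 5.1, 5.1],
]
labels, centers = kmeans(series, k=2)
print(labels)
```

Each series is treated as a point whose coordinates are its time samples, so this baseline compares series sample by sample and has no built-in tolerance for noise or outliers, which is the setting in which the abstract reports overlapping clusters for the existing method.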