Economic policy and research rely on the correct evaluation of the billions of high-frequency data points that are collected every day. Consistent clustering algorithms, such as DBSCAN, make it possible to extract meaningful structure from these data. However, while there is a large literature on the consistency of various clustering algorithms for high-dimensional static data, the literature on multivariate time series clustering still relies largely on heuristics or restrictive assumptions. The aim of this paper is to prove a notion of consistency of DBSCAN for the task of clustering multivariate time series.