Clustering in dynamic environments is of increasing importance, with applications ranging from real-time data analysis and online unsupervised learning to dynamic facility location problems. While meta-heuristics have shown promise in static clustering tasks, their use in dynamic environments, whether for tracking optimal clustering solutions or for maintaining robust clusterings over time, remains underexplored. This is partly due to a lack of dynamic datasets with diverse, controllable, and realistic dynamic characteristics, which hinders systematic performance evaluation of clustering algorithms across dynamic scenarios. As a result, there is a gap in both our understanding of clustering in dynamic environments and our ability to design effective algorithms for it. To bridge this gap, this paper introduces the Dynamic Dataset Generator (DDG). DDG features multiple dynamic Gaussian components integrated with a range of heterogeneous, local, and global changes. These changes vary in spatial and temporal severity, pattern, and domain of influence, making DDG a comprehensive tool for simulating a wide range of dynamic scenarios.
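To make the idea of dynamic Gaussian components concrete, the following is a minimal illustrative sketch and not the DDG implementation. It assumes a simple mixture whose component centers drift by a per-component random step at every time step (a heterogeneous, local change) and, every few steps, undergo one larger shared shift with re-drawn mixture weights (a global change). All names (`make_components`, `local_change`, `global_change`, `sample`, `run`) and all parameter ranges are hypothetical choices made for illustration.

```python
import random


def make_components(k, dim, rng):
    """Create k Gaussian components with random centers, spreads, and severities."""
    return [
        {
            "mean": [rng.uniform(-10.0, 10.0) for _ in range(dim)],
            "sigma": [rng.uniform(0.5, 2.0) for _ in range(dim)],
            "weight": 1.0 / k,
            # Assumed: each component has its own local change severity.
            "severity": rng.uniform(0.05, 0.5),
        }
        for _ in range(k)
    ]


def local_change(components, rng):
    """Shift each center by its own small random step (heterogeneous, local)."""
    for c in components:
        c["mean"] = [m + rng.gauss(0.0, c["severity"]) for m in c["mean"]]


def global_change(components, severity, rng):
    """Apply one shared, larger shift to all centers and re-draw the weights."""
    dim = len(components[0]["mean"])
    shift = [rng.gauss(0.0, severity) for _ in range(dim)]
    raw = [rng.uniform(0.1, 1.0) for _ in components]
    total = sum(raw)
    for c, w in zip(components, raw):
        c["mean"] = [m + s for m, s in zip(c["mean"], shift)]
        c["weight"] = w / total


def sample(components, n, rng):
    """Draw n points from the current mixture; returns (points, labels)."""
    weights = [c["weight"] for c in components]
    labels = rng.choices(range(len(components)), weights=weights, k=n)
    points = [
        [rng.gauss(m, s) for m, s in zip(components[j]["mean"], components[j]["sigma"])]
        for j in labels
    ]
    return points, labels


def run(steps=20, k=3, dim=2, n=200, global_every=10, seed=0):
    """Generate one dataset snapshot per time step from the evolving mixture."""
    rng = random.Random(seed)
    components = make_components(k, dim, rng)
    snapshots = []
    for t in range(1, steps + 1):
        local_change(components, rng)
        if t % global_every == 0:
            global_change(components, severity=3.0, rng=rng)
        snapshots.append(sample(components, n, rng))
    return snapshots
```

Calling `run()` returns a list of `(points, labels)` snapshots, one per time step, which a clustering algorithm could be evaluated on in sequence. The separation into local and global change functions is only one possible way to express changes that differ in severity and domain of influence.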