Clustering in dynamic environments is increasingly important, with applications ranging from real-time data analysis and online unsupervised learning to dynamic facility location problems. While meta-heuristics have shown promise in static clustering tasks, their application to tracking optimal clustering solutions, or to maintaining robust clusterings over time, in dynamic environments remains underexplored. This is partly due to the lack of dynamic datasets with diverse, controllable, and realistic dynamic characteristics, which hinders systematic performance evaluation of clustering algorithms across dynamic scenarios. As a result, our understanding of how to design effective algorithms for clustering in dynamic environments is limited. To bridge this gap, this paper introduces the Dynamic Dataset Generator (DDG). DDG features multiple dynamic Gaussian components subject to a range of heterogeneous, local, and global changes. These changes vary in spatial and temporal severity, pattern, and domain of influence, making DDG a comprehensive tool for simulating a wide range of dynamic scenarios.
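To make the idea of dynamic Gaussian components with local and global changes concrete, the following is a minimal sketch and not DDG's actual implementation. The `Component` class, the `step` and `sample` functions, and the parameters `shift_severity` and `global_severity` are all hypothetical names introduced for illustration. The sketch assumes 2-D isotropic Gaussians and models only one kind of change, a random drift of the component means.

```python
import random
from dataclasses import dataclass


@dataclass
class Component:
    """One 2-D isotropic Gaussian component (hypothetical structure)."""
    mean: list[float]
    sigma: float
    weight: float
    shift_severity: float  # local change: std. dev. of this component's drift


def step(components, rng, global_severity=0.0):
    """Apply one change event to all components, in place."""
    # Global change: a single random offset shared by every component.
    global_shift = [rng.gauss(0.0, global_severity) for _ in range(2)]
    for c in components:
        # Local, heterogeneous change: each component drifts by its own severity.
        c.mean = [m + rng.gauss(0.0, c.shift_severity) + g
                  for m, g in zip(c.mean, global_shift)]


def sample(components, n, rng):
    """Draw n points from the current mixture of components."""
    weights = [c.weight for c in components]
    chosen = rng.choices(components, weights=weights, k=n)
    return [tuple(rng.gauss(m, c.sigma) for m in c.mean) for c in chosen]


# Usage: take a snapshot of the data, then let the environment change.
rng = random.Random(0)
components = [
    Component(mean=[0.0, 0.0], sigma=1.0, weight=0.5, shift_severity=0.1),
    Component(mean=[5.0, 5.0], sigma=0.5, weight=0.5, shift_severity=1.0),
]
snapshots = []
for _ in range(3):
    snapshots.append(sample(components, 100, rng))
    step(components, rng, global_severity=0.2)
```

Each entry of `snapshots` is the dataset seen in one environment; a clustering algorithm under evaluation would be rerun or adapted after every call to `step`. Varying `shift_severity` per component and `global_severity` per event gives a crude analogue of the spatial severity and domain of influence described above.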