The growth of network-connected devices has led to an exponential increase in data generation, which poses significant challenges for efficient data analysis. Such data are generated continuously and arrive as a data stream. The characteristics of a data stream may change over time, a phenomenon known as concept drift. A method for handling data streams must therefore reduce their volume efficiently while adapting to these changing characteristics. This paper proposes a simple online vector quantization method for data streams with concept drift. The proposed method identifies units with low win probability and replaces them through remove-birth updating, which enables rapid adaptation to concept drift. The results show that the proposed method generates few dead units even in the presence of concept drift. They also suggest that some metrics calculated from the proposed method will be helpful for drift detection.
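To make the idea of remove-birth updating concrete, the following is a minimal sketch and not the paper's algorithm: the abstract does not specify the update rules, so every detail here is an assumption. In particular, the class name `RemoveBirthVQ`, the exponential moving average used to estimate win probability, the cutoff `0.1 / n_units`, the choice to place a newborn unit next to the most frequent winner, and all parameter values are hypothetical choices made for illustration.

```python
import random


class RemoveBirthVQ:
    """Toy online vector quantizer with a remove-birth step (illustrative only)."""

    def __init__(self, n_units, dim, lr=0.05, beta=0.01, seed=0):
        self.rng = random.Random(seed)
        # Reference vectors start uniformly at random in the unit hypercube.
        self.units = [[self.rng.random() for _ in range(dim)] for _ in range(n_units)]
        # Running estimate of each unit's win probability (starts uniform).
        self.win_prob = [1.0 / n_units] * n_units
        self.lr = lr                    # step size for moving the winner
        self.beta = beta                # smoothing rate of the win-probability estimate
        self.threshold = 0.1 / n_units  # assumed cutoff for a "low" win probability
        self.replacements = 0           # number of remove-birth events so far

    def nearest(self, x):
        """Index of the unit closest to x in squared Euclidean distance."""
        return min(
            range(len(self.units)),
            key=lambda i: sum((u - v) ** 2 for u, v in zip(self.units[i], x)),
        )

    def update(self, x):
        """Process one sample from the stream and return the winner's index."""
        n = len(self.units)
        winner = self.nearest(x)
        # Move the winner a small step toward the sample.
        self.units[winner] = [u + self.lr * (v - u) for u, v in zip(self.units[winner], x)]
        # Exponential moving average of the "this unit won" indicator.
        for i in range(n):
            hit = 1.0 if i == winner else 0.0
            self.win_prob[i] += self.beta * (hit - self.win_prob[i])
        # Remove-birth: replace the least-winning unit if it falls below the cutoff.
        loser = min(range(n), key=lambda i: self.win_prob[i])
        if self.win_prob[loser] < self.threshold:
            best = max(range(n), key=lambda i: self.win_prob[i])
            # Birth (assumed rule): the new unit appears next to the most frequent winner.
            self.units[loser] = [u + self.rng.gauss(0.0, 1e-3) for u in self.units[best]]
            # Share the win-probability mass between the two units.
            shared = (self.win_prob[best] + self.win_prob[loser]) / 2.0
            self.win_prob[best] = shared
            self.win_prob[loser] = shared
            self.replacements += 1
        return winner


def run_drift_demo(n_units=8, steps=3000, seed=1):
    """Feed a stream whose distribution jumps from [0,1)^2 to [5,6)^2."""
    rng = random.Random(seed)
    vq = RemoveBirthVQ(n_units, dim=2, seed=seed)
    for offset in (0.0, 5.0):  # abrupt drift between the two phases
        for _ in range(steps):
            vq.update([offset + rng.random(), offset + rng.random()])
    return vq
```

Calling `run_drift_demo()` returns a quantizer whose units have followed the stream into the new region: units stranded in the old region stop winning, their estimated win probability decays below the cutoff, and they are reborn near an active unit. In this sketch, a counter such as `replacements` is one example of a quantity that might serve as a drift signal, though whether it corresponds to the metrics the paper has in mind is not stated in the abstract.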