We present Kernel-QuantTree Exponentially Weighted Moving Average (KQT-EWMA), a non-parametric change-detection algorithm that combines the Kernel-QuantTree (KQT) histogram with the EWMA statistic to monitor multivariate data streams online. The resulting monitoring scheme is flexible, since histograms can model any stationary distribution, and practical, since the distribution of the test statistic does not depend on the distribution of the data stream in stationary conditions (non-parametric monitoring). KQT-EWMA controls false alarms by operating at a pre-determined Average Run Length ($ARL_0$), the expected number of stationary samples monitored before a false alarm is triggered. This property sets it apart from most non-parametric change-detection tests, which can rarely control the $ARL_0$ a priori. Our experiments on synthetic and real-world datasets demonstrate that KQT-EWMA controls the $ARL_0$ while achieving detection delays comparable to or lower than those of state-of-the-art methods designed to work in the same conditions.
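To make the idea of pairing a histogram with an EWMA statistic concrete, the following is a minimal sketch and not the KQT-EWMA algorithm itself. It assumes a univariate stream and replaces the KQT histogram with a plain equal-frequency histogram; the function names `fit_equal_frequency_bins` and `ewma_monitor`, the smoothing factor `lam`, and the constant `threshold` are all hypothetical choices made for illustration. In particular, the threshold is an arbitrary constant and is not calibrated to any target $ARL_0$.

```python
import bisect


def fit_equal_frequency_bins(train, n_bins):
    """Return cut points splitting 1-D training data into n_bins equal-frequency bins.

    Stand-in for a KQT histogram (assumption): plain empirical quantiles.
    """
    ordered = sorted(train)
    n = len(ordered)
    # Cut points sit at the k/n_bins empirical quantiles, k = 1..n_bins-1.
    return [ordered[(k * n) // n_bins] for k in range(1, n_bins)]


def ewma_monitor(stream, edges, lam=0.03, threshold=0.5):
    """Return the 1-based index of the first alarm, or None if no alarm is raised.

    `threshold` is an illustrative constant, NOT calibrated to a target ARL_0.
    """
    n_bins = len(edges) + 1
    target = [1.0 / n_bins] * n_bins  # expected bin probabilities when stationary
    z = list(target)                  # EWMA of bin indicators, started at the target
    for t, x in enumerate(stream, start=1):
        k = bisect.bisect_right(edges, x)  # index of the bin containing x
        for j in range(n_bins):
            indicator = 1.0 if j == k else 0.0
            z[j] = (1.0 - lam) * z[j] + lam * indicator
        # Pearson-like distance between smoothed bin frequencies and targets.
        stat = sum((z[j] - target[j]) ** 2 / target[j] for j in range(n_bins))
        if stat > threshold:
            return t
    return None


if __name__ == "__main__":
    edges = fit_equal_frequency_bins(list(range(80)), 8)
    # A stream stuck far above the training range piles into the last bin.
    print(ewma_monitor([1000.0] * 50, edges))
```

Because each sample enters the statistic only through the bin it falls into, and the bins are built to be equally probable under the training distribution, the behaviour of such a statistic in stationary conditions is driven by the bin probabilities rather than by the shape of the data distribution. This is the intuition behind the non-parametric property described above.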