The goal of anomaly detection is to identify observations generated by a process that differs from a reference one. An accurate anomaly detector must ensure low false positive and false negative rates. However, in the online context, this constraint remains highly challenging because the False Discovery Rate (FDR) is usually not controlled. In particular, the online framework makes it impossible to use classical multiple testing approaches such as the Benjamini-Hochberg (BH) procedure. Our strategy overcomes this difficulty by exploiting a local control of the ``modified FDR'' (mFDR). An important ingredient in this control is the cardinality of the calibration set used for computing empirical $p$-values, which turns out to be an influential parameter. This leads to a new strategy for tuning this parameter, which yields the desired FDR control over the whole time series. The statistical performance of this strategy is analyzed through theoretical guarantees, and its practical behavior is assessed by simulation experiments, which support our conclusions.
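To make the two ingredients named above concrete, the following minimal sketch computes an empirical $p$-value from a calibration set and applies the classical (offline) BH procedure to a batch of $p$-values. It is an illustration under stated assumptions, not the procedure proposed in this work: the function names `empirical_p_value` and `benjamini_hochberg` are hypothetical, and the "(1 + count) / (n + 1)" convention for the empirical $p$-value is one common choice that may differ from the definition used here.

```python
def empirical_p_value(score, calibration_scores):
    """Empirical p-value of an anomaly score (assumed convention).

    Larger scores are assumed to be more anomalous. The p-value is
    (1 + number of calibration scores >= score) / (n + 1), where n is
    the cardinality of the calibration set.
    """
    n = len(calibration_scores)
    n_at_least = sum(1 for c in calibration_scores if c >= score)
    return (1 + n_at_least) / (n + 1)


def benjamini_hochberg(p_values, alpha):
    """Classical offline BH procedure on a fixed batch of p-values.

    Returns a list of booleans: True where the hypothesis is rejected
    (the observation is flagged as an anomaly).
    """
    m = len(p_values)
    # Indices of the p-values sorted in increasing order.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest rank k such that p_(k) <= alpha * k / m (0 if none).
    k = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= alpha * rank / m:
            k = rank
    rejected = [False] * m
    for i in order[:k]:
        rejected[i] = True
    return rejected


if __name__ == "__main__":
    # Hypothetical calibration scores 1, 2, ..., 99 (n = 99).
    calibration = list(range(1, 100))
    scores = [100, 50, 99, 20]
    p_values = [empirical_p_value(s, calibration) for s in scores]
    print(p_values)
    print(benjamini_hochberg(p_values, alpha=0.1))
```

Under this convention, the smallest attainable $p$-value is $1/(n+1)$, where $n$ is the cardinality of the calibration set. A calibration set that is too small can therefore never produce a $p$-value below the BH thresholds, which gives some intuition for why this cardinality is an influential parameter. Note also that BH needs the whole batch of $p$-values at once, which is exactly what the online setting does not provide.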