This paper describes a new online multiple testing procedure for anomaly detection that controls the False Discovery Rate (FDR). An accurate anomaly detector must keep the false positive rate at a prescribed level while keeping the false negative rate as low as possible. In the online setting, however, this constraint is highly challenging because FDR control is usually unavailable: classical multiple testing approaches such as the Benjamini-Hochberg (BH) procedure cannot be used, since they require knowledge of the entire time series. The proposed strategy exploits local control of the ``modified FDR'' (mFDR) criterion. It turns out that local control of the mFDR yields global control of the FDR over the full series, up to additional modifications of the multiple testing procedures. An important ingredient in this control is the cardinality of the calibration dataset used to compute the empirical p-values. A dedicated strategy for tuning this parameter is designed to achieve the prescribed FDR control over the entire time series. The statistical performance of the full strategy is supported by theoretical guarantees, and its practical behavior is assessed through several simulation experiments, which support our conclusions.
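To make two of the ingredients named above concrete, the following is a minimal sketch, not the procedure developed in this work. It assumes the common "add-one" form of the empirical p-value computed from a calibration set of anomaly scores, and it shows the classical offline BH step-up rule, which needs the whole batch of p-values at once. The function names `empirical_p_values` and `benjamini_hochberg` are hypothetical.

```python
import numpy as np


def empirical_p_values(calibration_scores, test_scores):
    """Empirical p-values from a calibration set (assumed add-one form).

    Higher scores are assumed to be more anomalous. For a test score s and
    n calibration scores, p = (1 + #{calibration >= s}) / (n + 1).
    """
    cal = np.sort(np.asarray(calibration_scores, dtype=float))
    test = np.asarray(test_scores, dtype=float)
    n = cal.size
    # searchsorted(side="left") gives the index of the first calibration
    # score >= s, so n minus that index counts the scores >= s.
    n_greater_equal = n - np.searchsorted(cal, test, side="left")
    return (1.0 + n_greater_equal) / (n + 1.0)


def benjamini_hochberg(p_values, alpha):
    """Classical (offline) BH step-up rule; returns a boolean rejection mask."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    rejected = np.zeros(m, dtype=bool)
    if m == 0:
        return rejected
    order = np.argsort(p)
    # Compare the k-th smallest p-value with alpha * k / m.
    below = p[order] <= alpha * np.arange(1, m + 1) / m
    if below.any():
        k = np.max(np.nonzero(below)[0])
        # Reject every hypothesis up to the largest index meeting its threshold.
        rejected[order[: k + 1]] = True
    return rejected
```

With this form, the smallest attainable p-value is 1/(n+1) for a calibration set of size n, which is one simple way to see why the cardinality of the calibration set matters: if n is too small relative to the target level, no p-value can fall below the BH thresholds.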