We revisit sequential outlier hypothesis testing and derive bounds on the achievable exponents. The task in outlier hypothesis testing is to identify the set of outliers, which are generated from an anomalous distribution, among all observed sequences, most of which are generated from a nominal distribution. In the sequential setting, one obtains a sample from each sequence per unit time until a reliable decision can be made. We assume that the number of outliers is known while both the nominal and anomalous distributions are unknown. For the case of exactly one outlier, our bounds on the achievable exponents are tight, providing an exact large deviations characterization of sequential tests and strengthening a previous result of Li, Nitinawarat and Veeravalli (2017). In particular, we propose a sequential test that has bounded average sample size and better theoretical performance than the fixed-length test, a guarantee that the corresponding sequential test of Li, Nitinawarat and Veeravalli (2017) does not provide. Our results are also generalized to the case of multiple outliers.