We revisit sequential outlier hypothesis testing and derive bounds on the achievable exponents. In outlier hypothesis testing, the task is to identify the outliers, i.e., the sequences generated from an anomalous distribution, among a collection of observed sequences, most of which are generated from a nominal distribution. In the sequential setting, one sample is obtained from each sequence per unit time until a reliable decision can be made. We assume that the number of outliers is known, while both the nominal and anomalous distributions are unknown. For the case of exactly one outlier, our bounds on the achievable exponents are tight, providing an exact large deviations characterization of sequential tests and strengthening a previous result of Li, Nitinawarat and Veeravalli (2017). In particular, we propose a sequential test that has bounded average sample size and better theoretical performance than the fixed-length test, neither of which is guaranteed by the corresponding sequential test of Li, Nitinawarat and Veeravalli (2017). We also generalize our results to the case of multiple outliers.