In contemporary research, data scientists often test an infinite sequence of hypotheses $H_1, H_2, \ldots$ one by one, and are required to make real-time decisions without knowing the future hypotheses or data. In this paper, we consider such an online multiple testing problem with the goal of providing simultaneous lower bounds for the number of true discoveries in data-adaptively chosen rejection sets. Using the (online) closure principle, we show that for this task it is necessary to use an anytime-valid test for each intersection hypothesis. Motivated by this result, we construct a new online closed testing procedure, together with a corresponding short-cut, that provides a true discovery guarantee based on multiplying sequential e-values. This general yet simple procedure uniformly improves on existing methods and also allows the construction of entirely new and powerful procedures. In addition, we introduce new ideas for hedging and boosting sequential e-values that provably increase power. Finally, we propose the first online true discovery procedure for arbitrarily dependent e-values.
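
To illustrate the building block mentioned above, the following is a minimal sketch of an anytime-valid test for a single intersection hypothesis obtained by multiplying sequential e-values. It assumes that each e-value is nonnegative and has conditional expectation at most 1 given the past under the null, so that the running product is a nonnegative supermartingale and, by Ville's inequality, exceeds $1/\alpha$ at some time with probability at most $\alpha$. The function name `first_rejection_time` and its parameters are hypothetical and chosen for illustration only; this is not the paper's closed testing procedure or its short-cut.

```python
from typing import Iterable, Optional


def first_rejection_time(e_values: Iterable[float], alpha: float = 0.05) -> Optional[int]:
    """Return the 1-based time at which the running product of e-values
    first reaches 1/alpha, or None if it never does.

    Assumption (not checked here): the inputs are sequential e-values,
    i.e. each has conditional expectation at most 1 under the null.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")

    threshold = 1.0 / alpha
    wealth = 1.0  # running product of the e-values seen so far
    for t, e in enumerate(e_values, start=1):
        if e < 0.0:
            raise ValueError("e-values must be nonnegative")
        wealth *= e
        if wealth >= threshold:
            return t  # reject the intersection hypothesis at time t
    return None  # no rejection within the observed stream


if __name__ == "__main__":
    # Running products are 2, 8, 40; the threshold 1/0.05 is crossed at step 3.
    print(first_rejection_time([2.0, 4.0, 5.0], alpha=0.05))
```

Because the stream is consumed one value at a time, the decision at time $t$ depends only on the e-values observed up to $t$, which matches the online setting in which future hypotheses and data are unknown.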