We propose a novel nonparametric sequential test of composite hypotheses about the means of multiple data streams. Our method, \emph{peeking with expectation-based averaged capital} (PEAK), builds on the testing-by-betting framework and provides a non-asymptotic level-$\alpha$ test that remains valid at any stopping time. Our contributions are twofold: (1) we propose a novel betting scheme and provide theoretical guarantees on type-I error control, power, and asymptotic growth rate/$e$-power in the setting of a single data stream; (2) we introduce PEAK, a generalization of this betting scheme to multiple streams that (i) avoids wasteful union bounds by averaging, (ii) is a test of power one under mild regularity conditions on the sampling scheme of the streams, and (iii) reduces computational overhead when applying testing-by-betting approaches to pure-exploration bandit problems. We illustrate the practical benefits of PEAK on both synthetic data and real-world HeartSteps data. In our experiments, PEAK reduces the number of samples needed before stopping by up to 85\% compared to existing stopping rules for pure-exploration bandit problems, and it matches the performance of state-of-the-art sequential tests while improving computational complexity.
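
To make the role of averaging concrete, the following is a generic testing-by-betting sketch, not PEAK's own betting scheme, which this abstract does not specify. The assumptions are illustrative: observations lie in $[0,1]$, there are $K$ streams, and the global null states that each stream $k$ has conditional mean at most $m_k \in (0,1)$. The symbols $W^{(k)}_t$, $\lambda^{(k)}_s$, $m_k$, and $\mathcal{T}_k(t)$ are notation introduced here. Let $\mathcal{T}_k(t)$ denote the rounds up to time $t$ at which stream $k$ is sampled, and let $\lambda^{(k)}_s \in [0, 1/m_k]$ be a predictable bet. The per-stream capital and its average are
\[
W^{(k)}_t = \prod_{s \in \mathcal{T}_k(t)} \bigl(1 + \lambda^{(k)}_s (X^{(k)}_s - m_k)\bigr),
\qquad
\bar{W}_t = \frac{1}{K} \sum_{k=1}^{K} W^{(k)}_t .
\]
Under the global null, and provided the choice of which stream to sample is made before the observation is revealed, each $W^{(k)}_t$ is a nonnegative supermartingale with initial value $1$. The average $\bar{W}_t$ is then one as well, and Ville's inequality gives
\[
\mathbb{P}\bigl(\exists\, t \ge 1 : \bar{W}_t \ge 1/\alpha\bigr) \le \alpha .
\]
A union bound would instead reject once $\max_k W^{(k)}_t \ge K/\alpha$. Since $\bar{W}_t \ge \max_k W^{(k)}_t / K$, the averaged rule stops no later than the union-bound rule in this sketch.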