Significant progress has recently been made in deriving combination rules that take as input a set of arbitrarily dependent p-values and produce as output a single valid p-value. Here, we show that under the assumption of exchangeability of the p-values, many of those rules can be improved (made more powerful). While this observation has practical implications by itself (for example, for repeated tests involving data splitting), it also has implications for combining arbitrarily dependent p-values, since the latter can be made exchangeable by applying a uniformly random permutation. In particular, we derive several simple randomized combination rules for arbitrarily dependent p-values that are more powerful than their deterministic counterparts. For example, we derive randomized and exchangeable improvements of well-known p-value combination rules such as "twice the median" and "twice the average", as well as rules based on the geometric and harmonic means. The main technical advance is to show that all these combination rules can be obtained by calibrating the p-values to e-values (using an $\alpha$-dependent calibrator), averaging those e-values, converting the average to a level-$\alpha$ test using Markov's inequality, and finally obtaining a p-value by combining this family of tests. The improvements are delivered via recent randomized and exchangeable variants of Markov's inequality.
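To make the calibrate, average, and test pipeline concrete, the following minimal Python sketch works through the "twice the average" rule. It is an illustration under stated assumptions, not the paper's reference implementation: the calibrator $f_\alpha(p) = (2/\alpha)(1 - p/\alpha)_+$ is one choice that integrates to 1 over $[0,1]$ and whose Markov test rejects whenever twice the average is at most $\alpha$, and all function names are hypothetical. The randomized variant replaces the threshold $1/\alpha$ by $U/\alpha$ with $U \sim \mathrm{Uniform}(0,1)$ drawn independently of the data; the exchangeable variant checks the running averages of the e-values, so for arbitrarily dependent p-values the input should first be shuffled uniformly at random.

```python
import random
from statistics import fmean


def calibrate(p, alpha):
    """Assumed alpha-dependent calibrator: maps a p-value to an e-value.

    For a uniform p-value, the expectation of (2/alpha) * (1 - p/alpha)_+ is 1.
    """
    return (2.0 / alpha) * max(0.0, 1.0 - p / alpha)


def markov_test(pvals, alpha, u=1.0):
    """Reject when the average e-value is at least u / alpha.

    u = 1 gives the plain Markov test; drawing u ~ Uniform(0, 1)
    independently of the data gives the randomized variant.
    """
    evals = [calibrate(p, alpha) for p in pvals]
    return fmean(evals) >= u / alpha


def exchangeable_markov_test(pvals, alpha):
    """Reject when any running average of the e-values reaches 1 / alpha.

    Assumes the p-values are exchangeable in the order given.
    """
    total = 0.0
    for k, p in enumerate(pvals, start=1):
        total += calibrate(p, alpha)
        if total / k >= 1.0 / alpha:
            return True
    return False


def twice_average(pvals, u=1.0):
    """Closed form implied by the tests above: 2 * mean(p) / (2 - u), capped at 1.

    u = 1 recovers the deterministic "twice the average" rule.
    """
    return min(1.0, 2.0 * fmean(pvals) / (2.0 - u))


if __name__ == "__main__":
    rng = random.Random(0)           # fixed seed only for reproducibility
    pvals = [0.02, 0.03, 0.04]
    alpha = 0.05
    u = rng.random()                 # external randomization, independent of pvals

    shuffled = pvals[:]
    rng.shuffle(shuffled)            # uniformly random permutation

    print("deterministic reject:", markov_test(pvals, alpha))
    print("randomized reject:   ", markov_test(pvals, alpha, u=u))
    print("exchangeable reject: ", exchangeable_markov_test(shuffled, alpha))
    print("combined p-values:   ", twice_average(pvals), twice_average(pvals, u=u))
```

Because $U \le 1$ and the running averages include the full average, both variants reject whenever the deterministic test does, which is the sense in which they are at least as powerful.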