Minimally Discrete and Minimally Randomized p-Values

In meta-analysis, multiple hypothesis testing, and many other methods, p-values are used as inputs and are assumed to be uniformly distributed over the unit interval under the null hypotheses. If the data used to generate the p-values have discrete distributions, then natural, mid-, or randomized p-values are typically used. Natural and mid-p-values allow for valid, albeit conservative, downstream methods, since under the null hypothesis they are dominated by uniform distributions in the stochastic and convex order, respectively. Randomized p-values need not lead to conservative procedures, since they attain a uniform distribution under the null hypotheses through the generation of independent auxiliary variates. However, the auxiliary variates necessarily add variation to procedures. This manuscript introduces and studies ``minimally discrete'' (MD) natural p-values, MD mid-p-values and ``minimally randomized'' (MR) p-values. It is shown that MD p-values dominate their non-MD counterparts in the stochastic and convex order, and hence lead to less conservative, yet still valid, downstream methods. Likewise, MR p-values dominate their non-MR counterparts in that they are still uniformly distributed under the null hypotheses, but the added variation attributable to the independently generated auxiliary variate is smaller. It is anticipated that the results here will facilitate the construction of new meta-analysis and multiple testing methods via more efficient p-value construction, and will facilitate the theoretical study of existing and new methods by establishing gold standards for addressing the unavoidable, detrimental ``discreteness effect''.
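As a concrete reference point for the three standard constructions named above, the following minimal sketch computes natural, mid-, and randomized p-values for a discrete test statistic. It assumes, purely for illustration, a one-sided binomial test of H0: theta = theta0 against theta > theta0; the function names (`binom_pmf`, `upper_tail_p_values`, `null_mean`) are hypothetical. The MD and MR constructions are not defined in this abstract, so they are not implemented here.

```python
import math
import random


def binom_pmf(k, n, theta):
    """Probability that a Binomial(n, theta) variable equals k."""
    return math.comb(n, k) * theta**k * (1 - theta) ** (n - k)


def upper_tail_p_values(x, n, theta0=0.5, u=None):
    """Return (natural, mid, randomized) p-values for observing x successes.

    Assumed setting (illustrative only): X ~ Binomial(n, theta0) under H0,
    with large x counting as evidence against H0.
    u is the auxiliary Uniform(0, 1) variate; it is drawn here if not supplied.
    """
    if u is None:
        u = random.random()
    point = binom_pmf(x, n, theta0)  # P(X = x) under H0
    strict = sum(binom_pmf(k, n, theta0) for k in range(x + 1, n + 1))  # P(X > x)
    natural = strict + point          # P(X >= x)
    mid = strict + 0.5 * point        # half of the point mass
    randomized = strict + u * point   # random fraction of the point mass
    return natural, mid, randomized


def null_mean(n, theta0, which):
    """Exact null expectation of the natural (which=0) or mid (which=1) p-value."""
    return sum(
        binom_pmf(x, n, theta0) * upper_tail_p_values(x, n, theta0, u=0.5)[which]
        for x in range(n + 1)
    )


if __name__ == "__main__":
    nat, mid, rnd = upper_tail_p_values(8, 10, theta0=0.5, u=0.25)
    print(f"natural={nat:.6f} mid={mid:.6f} randomized={rnd:.6f}")
    print(f"E[natural]={null_mean(10, 0.5, 0):.6f} E[mid]={null_mean(10, 0.5, 1):.6f}")
```

In this sketch the natural p-value has null mean above 1/2, which reflects its conservativeness, while the mid-p-value has null mean exactly 1/2, as any variable dominated by a uniform in the convex order must. The randomized p-value depends on the auxiliary draw `u`, which is the added variation the abstract refers to.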
