The Population Difference Criterion is proposed to assess the statistical significance of visually observed subpopulation differences. It addresses two challenges: in high-dimensional contexts, distributional models can be dubious; in high-signal contexts, conventional permutation tests give poor pairwise comparisons. We also make two further contributions. First, a careful analysis shows that a balanced permutation approach is more powerful in high-signal contexts than conventional permutations. Second, we quantify the uncertainty due to permutation variation via a bootstrap confidence interval. The practical usefulness of these ideas is illustrated by comparing subpopulations in modern cancer data.
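The contrast between conventional and balanced permutations, and the bootstrap interval for permutation variation, can be illustrated with a minimal sketch. It is not the Population Difference Criterion itself: the test statistic (Euclidean distance between group means), the z-score summary, the equal even group sizes, the simulated Gaussian data, and all function names are assumptions made here for illustration only.

```python
import math
import random
import statistics


def mean_diff_stat(x, y):
    """Euclidean distance between the mean vectors of two groups (lists of rows)."""
    dims = range(len(x[0]))
    mx = [sum(row[j] for row in x) / len(x) for j in dims]
    my = [sum(row[j] for row in y) / len(y) for j in dims]
    return math.dist(mx, my)


def permutation_null(x, y, n_perm, rng, balanced):
    """Statistics obtained by randomly relabeling the pooled sample.

    balanced=True: each new group takes exactly half its rows from x and half from y.
    balanced=False: conventional permutation, any equal-size split of the pooled rows.
    """
    n = len(x)
    if len(y) != n or n % 2:
        raise ValueError("this sketch assumes equal, even group sizes")
    pooled = x + y
    stats = []
    for _ in range(n_perm):
        if balanced:
            first = rng.sample(range(n), n // 2) + rng.sample(range(n, 2 * n), n // 2)
        else:
            first = rng.sample(range(2 * n), n)
        chosen = set(first)
        g1 = [pooled[i] for i in chosen]
        g2 = [pooled[i] for i in range(2 * n) if i not in chosen]
        stats.append(mean_diff_stat(g1, g2))
    return stats


def z_score(observed, null_stats):
    """Observed statistic standardized by the permutation null mean and spread."""
    return (observed - statistics.fmean(null_stats)) / statistics.stdev(null_stats)


def bootstrap_ci(observed, null_stats, n_boot, rng, level=0.95):
    """Percentile bootstrap interval for the z-score, resampling the null statistics."""
    zs = sorted(
        z_score(observed, rng.choices(null_stats, k=len(null_stats)))
        for _ in range(n_boot)
    )
    lo_idx = int((1 - level) / 2 * (n_boot - 1))
    hi_idx = int((1 + level) / 2 * (n_boot - 1))
    return zs[lo_idx], zs[hi_idx]


# Simulated high-signal example: 20 rows per group, 50 dimensions, mean shift of 1.
rng = random.Random(0)
x = [[rng.gauss(0.0, 1.0) for _ in range(50)] for _ in range(20)]
y = [[rng.gauss(1.0, 1.0) for _ in range(50)] for _ in range(20)]
observed = mean_diff_stat(x, y)
for balanced in (False, True):
    null = permutation_null(x, y, 200, rng, balanced)
    print(balanced, z_score(observed, null), bootstrap_ci(observed, null, 500, rng))
```

In this toy setting a balanced relabeling puts equal numbers of rows from each original group into both new groups, so the mean shift cancels in the relabeled statistic. A conventional relabeling usually leaves some imbalance, which inflates the null spread when the signal is strong and so lowers the z-score.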