Active Hypothesis Testing under Computational Budgets with Applications to GWAS and LLM

In large-scale hypothesis testing, computing exact $p$-values or $e$-values is often resource-intensive, creating a need for budget-aware inferential methods. We propose a general framework for active hypothesis testing that leverages inexpensive auxiliary statistics to allocate a global computational budget. For each hypothesis, our data-adaptive procedure probabilistically decides whether to compute the exact test statistic or a transformed proxy, guaranteeing a valid $p$-value or $e$-value while satisfying the exact budget constraint. Theoretical guarantees are established for our constructions, showing that the procedure achieves optimality for $e$-values and for $p$-values under independence, and admissibility for $p$-values under general dependence. Empirical results from simulations and two real-world applications, a large-scale genome-wide association study (GWAS) and a clinical prediction task leveraging large language models (LLMs), demonstrate that our framework improves statistical efficiency under fixed resource limits.
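To make the idea of probabilistically deciding when to pay for an exact statistic concrete, here is a minimal Python sketch. It is not the construction proposed in the paper, whose details are not given in the abstract. It swaps in a simpler, standard inverse-probability-weighting device for $e$-values: hypothesis $i$ is queried with probability $\pi_i$ derived from a cheap proxy score, and the reported value is $E_i/\pi_i$ if queried and $0$ otherwise, so its expectation equals that of $E_i$ and it remains a valid $e$-value. The function names, the proportional allocation rule, and the `compute_exact` callback are all hypothetical. Unlike the paper's procedure, this sketch only bounds the expected number of exact computations by the budget.

```python
import random


def query_probabilities(proxies, budget):
    """Turn positive proxy scores into query probabilities.

    Assumed allocation rule (illustrative only): probabilities are
    proportional to the proxy scores, capped at 1, so their sum,
    the expected number of exact computations, is at most `budget`.
    """
    if any(s <= 0 for s in proxies):
        raise ValueError("proxy scores must be strictly positive")
    scale = budget / sum(proxies)
    return [min(1.0, scale * s) for s in proxies]


def active_e_values(proxies, compute_exact, budget, rng):
    """Return (reported e-values, number of exact computations).

    compute_exact(i) is a hypothetical callback returning the expensive
    exact e-value for hypothesis i. The query decision uses randomness
    that is independent of the exact e-values given the proxies.
    """
    probs = query_probabilities(proxies, budget)
    reported, n_queries = [], 0
    for i, p in enumerate(probs):
        if rng.random() < p:
            # Queried: reweight by 1/p so the expectation is preserved.
            reported.append(compute_exact(i) / p)
            n_queries += 1
        else:
            # Not queried: report 0, which costs nothing to compute.
            reported.append(0.0)
    return reported, n_queries


if __name__ == "__main__":
    rng = random.Random(0)
    proxies = [1.0, 2.0, 3.0, 4.0]
    values, used = active_e_values(proxies, lambda i: 1.0, budget=2, rng=rng)
    print(values, used)
```

Reporting zero for unqueried hypotheses wastes whatever information the proxy carries. The abstract indicates that the paper instead reports a transformed proxy in that case, which is where the efficiency gains would come from.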


