This paper studies the multiple testing problem for sparse high-dimensional data with binary outcomes. We use the empirical Bayes posterior to construct multiple testing procedures and evaluate how well they control the false discovery rate (FDR). We first show that the $\ell$-value (a.k.a. local FDR) procedure can be overly conservative in estimating the FDR when the conjugate spike-and-uniform-slab prior is chosen. To address this, we propose two new procedures that calibrate the posterior to achieve correct FDR control. We establish sharp frequentist theoretical results for these procedures and conduct numerical experiments to validate the theory in finite samples. To the best of our knowledge, this is the first {\it uniform} FDR control result for multiple testing of high-dimensional data with binary outcomes under the sparsity assumption.
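To make the $\ell$-value procedure mentioned above concrete, the following is a minimal sketch under assumptions that the abstract does not state: each observation is taken to be $X_i \sim \mathrm{Binomial}(m, \theta_i)$, the null hypothesis is taken to be $\theta_i = 1/2$ (the spike), and the slab is the uniform prior on $[0, 1]$, whose marginal is $1/(m+1)$ for every count. The function names, the grid-search estimate of the prior weight `w`, and the threshold `t` are all hypothetical choices made for illustration; the two calibrated procedures proposed in the paper are not reproduced here.

```python
import math


def null_pmf(x, m):
    """P(X = x) under the assumed null, X ~ Binomial(m, 1/2)."""
    return math.comb(m, x) * 0.5 ** m


def slab_pmf(x, m):
    """Marginal P(X = x) when theta ~ Uniform(0, 1); equals 1 / (m + 1) for all x."""
    return 1.0 / (m + 1)


def ell_values(xs, m, w):
    """Posterior probability of the null (the l-value) for each count in xs."""
    out = []
    for x in xs:
        f0 = (1.0 - w) * null_pmf(x, m)  # spike contribution
        f1 = w * slab_pmf(x, m)          # slab contribution
        out.append(f0 / (f0 + f1))
    return out


def estimate_weight(xs, m, grid_size=999):
    """Empirical Bayes step (assumed form): maximise the marginal
    log-likelihood of the mixture over a grid of weights in (0, 1)."""
    best_w, best_ll = None, -math.inf
    for k in range(1, grid_size + 1):
        w = k / (grid_size + 1)
        ll = sum(
            math.log((1.0 - w) * null_pmf(x, m) + w * slab_pmf(x, m))
            for x in xs
        )
        if ll > best_ll:
            best_w, best_ll = w, ll
    return best_w


def ell_value_procedure(xs, m, t):
    """Reject hypothesis i when its l-value is at most the threshold t."""
    w = estimate_weight(xs, m)
    return [ell <= t for ell in ell_values(xs, m, w)]


# Toy data: 90 counts near m / 2 (null-like) and 10 extreme counts.
m = 20
xs = [10, 9, 11, 10, 8, 12, 10, 11, 9, 10] * 9 + [0, 20] * 5
rejections = ell_value_procedure(xs, m, t=0.2)
print(sum(rejections), "rejections out of", len(xs))
```

In this toy example the counts near $m/2$ receive $\ell$-values close to one and only the extreme counts are rejected. The sketch illustrates the mechanics of thresholding the posterior null probability only; it says nothing about the FDR guarantees or the conservativeness discussed in the abstract.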