Hypothesis testing of structure in correlation and covariance matrices is of broad interest in many application areas. In high dimensions and/or with small to moderate sample sizes, high error rates in testing are a substantial concern. This article focuses on increasing power through a frequentist, assisted by Bayes (FAB) procedure, which boosts power by incorporating prior information on the correlation parameters. In particular, we suppose that one of two sources of prior information is available: (i) a prior dataset that is distinct from the current data but related enough that it may contain valuable information about the correlation structure in the current data; or (ii) knowledge that the different correlation parameters tend to be similar, so that a hierarchical model is appropriate. When the prior information is relevant, the proposed FAB approach can yield substantial gains in power. A divide-and-conquer algorithm is developed to reduce computational complexity when the number of tests is massive. We show improvements in power for detecting correlated gene pairs in genomic studies while maintaining control of the Type I error rate or the false discovery rate (FDR).
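As a rough illustration of how prior information can sharpen a test of a single correlation, the sketch below computes a FAB-style p-value for the null hypothesis of zero correlation. It is not the procedure developed in this article. It assumes the Fisher z-transform approximation, under which atanh(r) is approximately N(atanh(ρ), 1/(n − 3)), a normal prior N(μ, τ²) on the transformed correlation (for example, estimated from a prior dataset), and the scalar normal-mean form p = 1 − |Φ(z + 2μσ/τ²) − Φ(−z)| with z = atanh(r)/σ. The function and parameter names are hypothetical.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for its CDF


def fab_p_value(r: float, n: int,
                prior_mean: float = 0.0,
                prior_var: float = math.inf) -> float:
    """FAB-style p-value for H0: rho = 0 from one sample correlation.

    Assumptions (not taken from the article):
      * atanh(r) is approximately N(atanh(rho), 1 / (n - 3));
      * prior_mean and prior_var describe a normal prior on atanh(rho)
        and were chosen without looking at the current data.
    """
    if n <= 3:
        raise ValueError("need n > 3 for the Fisher z approximation")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    if prior_var <= 0:
        raise ValueError("prior_var must be positive")

    sigma = 1.0 / math.sqrt(n - 3)   # approximate std. dev. of atanh(r)
    z = math.atanh(r) / sigma        # standardized test statistic

    # An infinite prior variance means no prior information: zero shift.
    shift = 0.0 if math.isinf(prior_var) else 2.0 * prior_mean * sigma / prior_var

    return 1.0 - abs(_STD_NORMAL.cdf(z + shift) - _STD_NORMAL.cdf(-z))


# Example: the same observed correlation, with and without prior information.
p_flat = fab_p_value(0.3, 50)
p_informed = fab_p_value(0.3, 50, prior_mean=0.3, prior_var=0.05)
```

With an infinite prior variance the shift vanishes and the expression reduces to the usual two-sided p-value. For any fixed shift the p-value remains uniform under the null, so Type I error is controlled whether or not the prior is accurate. The price of a misleading prior is lower power: if the prior mean has the wrong sign, the p-value is larger than the classical one.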