Identifying replicable signals across different studies provides stronger scientific evidence and more powerful inference. Existing literature on high-dimensional replicability analysis either imposes strong modeling assumptions or has low power. We develop a powerful and robust empirical Bayes approach for high-dimensional replicability analysis. Our method effectively borrows information across features and studies while accounting for heterogeneity. We show, both empirically and theoretically, that the proposed method has better power than competing methods while controlling the false discovery rate. Analyzing datasets from genome-wide association studies reveals new biological insights that cannot be obtained with existing methods.