Motivated by applications in tissue-wide association studies (TWAS), we develop a flexible and theoretically grounded empirical Bayes approach for integrating vector-valued outcome data obtained from different sources. We propose a linear shrinkage estimator that effectively shrinks the singular values of a data matrix. This problem is closely connected to estimating covariance matrices under a specific loss, for which we develop asymptotically optimal estimators. The basic linear shrinkage estimator is then extended to a local linear shrinkage estimator, which offers greater flexibility. Crucially, unlike well-known sparse or reduced-rank estimators in the literature, the proposed method works under sparse or dense and low-rank or non-low-rank parameter settings. Furthermore, the empirical Bayes approach is computationally more scalable than intensive full Bayes procedures. The method is evaluated through an extensive set of numerical experiments and applied to a real TWAS dataset obtained from the Genotype-Tissue Expression (GTEx) project.