Comparative meta-analyses that integrate multiple observational studies rely on estimated propensity scores (PSs) to mitigate covariate imbalances between groups of subjects. However, PS estimation faces both theoretical and practical challenges when the covariates are high-dimensional. Motivated by an integrative analysis of breast cancer patients across seven medical centers, this paper tackles the challenges of integrating multiple observational datasets. The proposed inferential technique, called Bayesian Motif Submatrices for Covariates (B-MSC), addresses the curse of dimensionality through a hybrid of Bayesian and frequentist approaches. B-MSC uses nonparametric Bayesian "Chinese restaurant" processes to eliminate redundancy in the high-dimensional covariates and to discover latent motifs, or lower-dimensional structure. With these motifs as potential predictors, standard regression techniques can be used to accurately infer the PSs and facilitate covariate-balanced group comparisons. Simulations and a meta-analysis of the motivating cancer investigation show that B-MSC accurately estimates the propensity scores and efficiently addresses covariate imbalance when integrating observational health studies with high-dimensional covariates.
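To illustrate the idea of grouping redundant covariates, the following is a minimal sketch and not the B-MSC procedure itself. It draws one partition of covariate indices from a Chinese restaurant process prior and then averages the covariates within each cluster to form lower-dimensional summaries. The function names `sample_crp_partition` and `collapse_to_motifs`, the concentration parameter `alpha`, and the use of cluster means as motif summaries are assumptions made for illustration; B-MSC infers its clustering from the data rather than from a single prior draw.

```python
import random


def sample_crp_partition(n_items, alpha, rng):
    """Draw cluster labels for n_items from a Chinese restaurant process.

    alpha > 0 is the concentration parameter (an illustrative choice).
    """
    labels = []
    counts = []  # counts[k] = number of items already seated in cluster k
    for _ in range(n_items):
        # An item joins existing cluster k with weight counts[k],
        # or opens a new cluster with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)
        else:
            counts[k] += 1
        labels.append(k)
    return labels


def collapse_to_motifs(X, labels):
    """Average the columns of X that share a cluster label.

    X is a list of rows (subjects); labels[j] is the cluster of column j.
    Cluster means stand in for motifs here; this is an assumption.
    """
    n_clusters = max(labels) + 1
    sizes = [labels.count(k) for k in range(n_clusters)]
    motifs = []
    for row in X:
        sums = [0.0] * n_clusters
        for value, k in zip(row, labels):
            sums[k] += value
        motifs.append([s / m for s, m in zip(sums, sizes)])
    return motifs


rng = random.Random(0)
labels = sample_crp_partition(n_items=12, alpha=1.0, rng=rng)
# Hypothetical data: 5 subjects, 12 covariates.
X = [[rng.gauss(0, 1) for _ in range(12)] for _ in range(5)]
motifs = collapse_to_motifs(X, labels)
```

In this sketch the columns of `motifs` could then serve as predictors in a standard regression for the PSs, such as a logistic regression of group membership on the cluster summaries.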