Comparative meta-analyses of patient groups that integrate multiple observational studies rely on estimated propensity scores (PSs) to mitigate confounder imbalances. However, PS estimation faces theoretical and practical challenges when the confounders are high-dimensional. Motivated by an integrative analysis of breast cancer patients across seven medical centers, this paper tackles the challenges of integrating multiple observational datasets and of offering nationally interpretable results. The proposed inferential technique, called Bayesian Motif Submatrices for Confounders (B-MSMC), addresses the curse of dimensionality through a hybrid of Bayesian and frequentist approaches. B-MSMC uses nonparametric Bayesian ``Chinese restaurant'' processes to eliminate redundancy in the high-dimensional confounders and to discover latent motifs, or lower-dimensional structure. With these motifs as potential predictors, standard regression techniques can be used to accurately infer the PSs and facilitate causal group comparisons. Simulations and a meta-analysis of the motivating cancer investigation demonstrate the efficacy of the proposal for high-dimensional causal inference across multiple observational studies. Using different weighting methods, we apply B-MSMC to efficiently address confounding when integrating observational health studies with high-dimensional confounders.
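To make the ``Chinese restaurant'' process mentioned above concrete, the following is a minimal sketch of how such a process draws a random partition: each new item joins an existing group with probability proportional to that group's size, or opens a new group with probability proportional to a concentration parameter. This is a generic illustration, not the B-MSMC procedure itself, whose details the abstract does not give. The function name `sample_crp_partition`, the parameter `alpha`, and the reading of items as confounder columns and groups as candidate motifs are assumptions made only for this example.

```python
import random


def sample_crp_partition(n_items, alpha, seed=None):
    """Draw group labels for n_items from a Chinese restaurant process.

    Hypothetical helper for illustration only. `alpha` is the
    concentration parameter: larger values tend to give more groups.
    """
    if n_items < 0:
        raise ValueError("n_items must be non-negative")
    if alpha <= 0:
        raise ValueError("alpha must be positive")

    rng = random.Random(seed)
    labels = []        # labels[i] is the group index of item i
    group_sizes = []   # group_sizes[k] is the current size of group k

    for i in range(n_items):
        # Item i joins existing group k with probability
        # group_sizes[k] / (i + alpha) and opens a new group with
        # probability alpha / (i + alpha); rng.choices normalizes
        # the weights, so the denominator is implicit.
        weights = group_sizes + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(group_sizes):
            group_sizes.append(1)   # open a new group
        else:
            group_sizes[k] += 1     # join an existing group
        labels.append(k)

    return labels, group_sizes


if __name__ == "__main__":
    # Assumed setting: 200 items standing in for confounder columns.
    labels, sizes = sample_crp_partition(200, alpha=1.0, seed=0)
    print("number of groups:", len(sizes))
    print("largest group size:", max(sizes))
```

The number of groups is not fixed in advance but grows slowly with the number of items, which is what makes this kind of prior a natural fit for collapsing many redundant variables into a smaller set of groups.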