Bayesian multilevel multivariate logistic regression for superiority decision-making under observable treatment heterogeneity

In medical, social, and behavioral research, datasets often have a multilevel structure and multiple correlated dependent variables. Such data are frequently collected from a study population comprising several subpopulations with different (i.e., heterogeneous) effects of an intervention. Although such data occur frequently, methods to analyze them are comparatively rare, and researchers often resort to ignoring the multilevel and/or heterogeneous structure, analyzing only a single dependent variable, or a combination of these. These analysis strategies are suboptimal: ignoring multilevel structures inflates Type I error rates, while neglecting the multivariate or heterogeneous structure masks detailed insights. To analyze such data comprehensively, the current paper presents a novel Bayesian multilevel multivariate logistic regression model. The model takes the clustered structure of multilevel data into account, so that posterior inferences can be made with accurate error rates. Further, the model shares information between subpopulations when estimating average and conditional average multivariate treatment effects. To facilitate interpretation, multivariate logistic regression parameters are transformed to posterior success probabilities and differences between them. A numerical evaluation compared our framework to less comprehensive alternatives and highlighted the need to model the multilevel structure: treatment comparisons based on the multilevel model attained the targeted Type I error rates, while single-level alternatives resulted in inflated Type I error rates. Further, the multilevel model was more powerful than a single-level model when the number of clusters was higher. ...
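
The transformation from regression parameters to success probabilities and their differences can be illustrated with a minimal sketch. The abstract does not spell out the parameterization, so everything here is an assumption for illustration: two binary outcomes, a multinomial logit over the four joint response categories (11, 10, 01, 00) with 00 as the reference, simulated stand-ins for posterior draws, and hypothetical function names (`joint_probs`, `success_probs`). The "all" and "any" superiority summaries at the end are likewise illustrative choices, not necessarily the decision rules used in the paper.

```python
import numpy as np

def joint_probs(eta):
    """Softmax over the four joint response categories (11, 10, 01, 00)."""
    z = eta - eta.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def success_probs(phi):
    """Marginal success probabilities of outcomes 1 and 2.

    Columns of phi are assumed to be ordered (11, 10, 01, 00).
    """
    theta1 = phi[..., 0] + phi[..., 1]  # outcome 1 succeeds in categories 11 and 10
    theta2 = phi[..., 0] + phi[..., 2]  # outcome 2 succeeds in categories 11 and 01
    return np.stack([theta1, theta2], axis=-1)

rng = np.random.default_rng(0)
n_draws = 4000

def fake_posterior(means, sd=0.2):
    """Simulated stand-in for posterior draws of linear predictors (not real results)."""
    free = rng.normal(means, sd, size=(n_draws, 3))
    return np.column_stack([free, np.zeros(n_draws)])  # reference category 00 fixed at 0

eta_treatment = fake_posterior([0.4, 0.1, 0.0])
eta_control = fake_posterior([0.0, 0.0, 0.0])

# Draw-wise difference in success probabilities (treatment minus control)
delta = success_probs(joint_probs(eta_treatment)) - success_probs(joint_probs(eta_control))

# Posterior probability of superiority on both outcomes, and on at least one
p_all = np.mean((delta > 0).all(axis=1))
p_any = np.mean((delta > 0).any(axis=1))
print(f"P(superior on both outcomes)   = {p_all:.3f}")
print(f"P(superior on at least one)    = {p_any:.3f}")
```

Working on the probability scale draw by draw keeps the full posterior uncertainty in the treatment differences, which is what makes the resulting quantities directly interpretable for a superiority decision.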
