Multilevel regression and poststratification (MRP) is a popular method for addressing selection bias in subgroup estimation, with applications in fields ranging from the social sciences to public health. In this paper, we examine the inferential validity of MRP in finite populations and explore the impact of poststratification and model specification. The success of MRP relies heavily on the availability of auxiliary information that is strongly related to the outcome. To improve the fit of the outcome model, we recommend modeling the inclusion probabilities conditionally on auxiliary variables and incorporating flexible functions of the estimated inclusion probabilities as predictors in the mean structure. We present a statistical data integration framework that offers robust inferences for probability and nonprobability surveys and addresses various challenges that arise in practical applications. Our simulation studies support the statistical validity of MRP, which involves a tradeoff between bias and variance, and show that, compared with alternative methods, its benefits are greatest for subgroup estimates based on small sample sizes. We apply our methods to the Adolescent Brain Cognitive Development (ABCD) Study, which collected information on children across 21 geographic locations in the U.S. to provide national representation but, as a nonprobability sample, is subject to selection bias. We focus on the cognition measure for diverse groups of children in the ABCD Study and show that the use of auxiliary variables affects the findings on cognitive performance.
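To make the poststratification step concrete, the following is a minimal sketch, not the implementation used in the paper. It assumes the multilevel outcome model has already produced a predicted mean for each poststratification cell, and that population counts for those cells are known. The function name `poststratify`, the (subgroup, auxiliary level) cell layout, and all numbers are hypothetical and chosen only for illustration.

```python
from typing import Dict, Tuple

# Hypothetical cell layout: (subgroup label, auxiliary-variable level).
Cell = Tuple[str, str]


def poststratify(cell_means: Dict[Cell, float],
                 cell_counts: Dict[Cell, int]) -> Dict[str, float]:
    """Combine cell-level predictions into subgroup estimates.

    Each subgroup estimate is the population-count-weighted average of the
    model-based cell means that fall inside that subgroup.
    """
    totals: Dict[str, float] = {}
    sizes: Dict[str, int] = {}
    for cell, mean in cell_means.items():
        group = cell[0]
        n_pop = cell_counts[cell]  # known population size of this cell
        totals[group] = totals.get(group, 0.0) + n_pop * mean
        sizes[group] = sizes.get(group, 0) + n_pop
    return {group: totals[group] / sizes[group] for group in totals}


# Illustrative numbers only; these are not taken from the ABCD Study.
cell_means = {
    ("A", "low"): 10.0, ("A", "high"): 20.0,
    ("B", "low"): 12.0, ("B", "high"): 18.0,
}
cell_counts = {
    ("A", "low"): 300, ("A", "high"): 100,
    ("B", "low"): 50, ("B", "high"): 150,
}
estimates = poststratify(cell_means, cell_counts)
```

In this sketch the weights come from population cell counts rather than sample cell sizes, which is how poststratification adjusts for a sample whose composition differs from that of the population. The multilevel regression step, including any functions of estimated inclusion probabilities used as predictors, would supply `cell_means` and is not shown here.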