The American Community Survey (ACS) Public Use Microdata Sample (PUMS) provides unit-level survey data that include correlated Gaussian and binomial responses together with their associated survey weights. Motivated by this structure, we propose a Bayesian hierarchical framework for jointly modeling unit-level Gaussian and binomial survey data. The model introduces a shared area-level random effect to capture dependence between the two response types. Informative sampling is addressed through a pseudo-likelihood construction, and Pólya-Gamma data augmentation yields an efficient conjugate Gibbs sampler, enabling scalable inference for large survey datasets. In empirical simulations based on ACS PUMS data, the joint model achieves notable reductions in mean squared error and improved interval scores relative to univariate and design-based estimators. Applying the method to the 2023 Illinois PUMS data, we find that the joint model yields small-area estimates similar to those from the univariate model and the Horvitz-Thompson estimator, but with smaller posterior variances. The computational cost of the joint model is comparable to that of the univariate binomial model. Together with the simulation results, these findings demonstrate the practical advantages of the proposed approach.
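The pseudo-likelihood idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard construction in which each unit's log-likelihood contribution is multiplied by its survey weight, and it treats the binomial response as Bernoulli with a logit link. The function names, the parameterization, and the toy data are hypothetical and are not taken from the paper; the shared random effect and the Gibbs sampler are not shown.

```python
import numpy as np


def pseudo_loglik_gaussian(y, mu, sigma2, w):
    """Survey-weighted Gaussian log-likelihood: sum_i w_i * log N(y_i | mu_i, sigma2)."""
    y = np.asarray(y, dtype=float)
    mu = np.asarray(mu, dtype=float)
    w = np.asarray(w, dtype=float)
    ll = -0.5 * np.log(2.0 * np.pi * sigma2) - (y - mu) ** 2 / (2.0 * sigma2)
    return float(np.sum(w * ll))


def pseudo_loglik_bernoulli(z, eta, w):
    """Survey-weighted Bernoulli log-likelihood with logit link.

    eta is the linear predictor (log-odds); log(1 + exp(eta)) is computed
    stably with np.logaddexp.
    """
    z = np.asarray(z, dtype=float)
    eta = np.asarray(eta, dtype=float)
    w = np.asarray(w, dtype=float)
    ll = z * eta - np.logaddexp(0.0, eta)
    return float(np.sum(w * ll))


# Toy data (hypothetical): three sampled units with survey weights.
y = np.array([1.0, 2.0, 0.5])      # Gaussian response
z = np.array([1.0, 0.0, 1.0])      # binary response
w = np.array([2.0, 1.0, 3.0])      # survey weights
mu = np.zeros(3)                   # Gaussian mean, e.g. x_i' beta + area effect
eta = np.array([0.3, -0.2, 0.1])   # log-odds, e.g. x_i' gamma + area effect

# Joint pseudo log-likelihood, conditional on the mean structure.
joint = pseudo_loglik_gaussian(y, mu, 1.0, w) + pseudo_loglik_bernoulli(z, eta, w)
```

With integer weights, the weighted sum equals the ordinary log-likelihood of a dataset in which each unit is replicated as many times as its weight, which is one way to read the construction: each sampled unit stands in for the population units it represents. In practice the weights are often rescaled (for example, to sum to the sample size) before use; whether and how the paper does so is not stated in the abstract.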