We present csSampling, an R package for estimating Bayesian models from data collected under complex survey sampling designs. csSampling combines the probabilistic programming language Stan (via the rstan and brms R packages) with the complex survey data handling of the survey R package. Under this approach, the user either creates a survey-weighted model in brms or provides a custom weighted model via rstan. Survey design information is supplied through the svydesign function of the survey package. The cs_sampling function of csSampling estimates the weighted Stan model and applies an asymptotic covariance correction for the model mis-specification that results from using survey sampling weights as plug-in values in the likelihood. This correction is often described as a ``design effect'', which is the ratio of the variance of an estimator under the complex survey sample to its variance under a simple random sample of the same size. The resulting adjusted posterior draws can then be used for the usual Bayesian inference while also achieving the frequentist properties of asymptotic consistency and correct uncertainty quantification (e.g., nominal coverage of interval estimates).
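As a rough sketch of the two quantities described above, the weighted likelihood and the design effect can be written as follows. The notation is an assumption introduced here for illustration, not taken from the abstract: $y_i$ is the response for sampled unit $i = 1, \dots, n$, $\tilde{w}_i$ is its sampling weight (assumed normalized to sum to $n$), $\theta$ is the parameter vector with prior $\pi(\theta)$, and $\hat{\theta}$ is a point estimator of $\theta$.

```latex
% Survey-weighted pseudo-posterior: each likelihood contribution is
% raised to the power of its (normalized) sampling weight.
\[
  \pi^{w}(\theta \mid y, \tilde{w})
  \;\propto\;
  \left[ \prod_{i=1}^{n} p(y_i \mid \theta)^{\tilde{w}_i} \right] \pi(\theta),
  \qquad \sum_{i=1}^{n} \tilde{w}_i = n .
\]

% Design effect: variance under the complex design relative to the
% variance under a simple random sample (SRS) of the same size n.
\[
  \mathrm{deff}(\hat{\theta})
  \;=\;
  \frac{\operatorname{Var}_{\text{design}}(\hat{\theta})}
       {\operatorname{Var}_{\text{SRS}}(\hat{\theta})} .
\]
```

Under this reading, a design effect above 1 means the plug-in weighted posterior would understate uncertainty relative to the sampling design, which is the gap the covariance correction is meant to close.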