Nonprobability (convenience) samples are increasingly used to stabilize estimates of one or more population variables of interest obtained from a randomized survey (reference) sample, by increasing the effective sample size. Estimating a population quantity from a convenience sample alone typically incurs bias, because the distribution of the variables of interest in the convenience sample differs from that in the population. A recent set of approaches estimates inclusion probabilities for convenience sample units, conditional on sampling design predictors, by specifying reference sample-weighted pseudo likelihoods. As our main result, this paper introduces a novel approach that derives the propensity score for the observed sample as a function of the conditional inclusion probabilities for the reference and convenience samples. Our approach allows specification of an exact likelihood for the observed sample. We construct a Bayesian hierarchical formulation that simultaneously estimates sample propensity scores and both conditional and reference sample inclusion probabilities for the convenience sample units. We compare our exact likelihood with the pseudo likelihoods in a Monte Carlo simulation study.
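To make the stated relation concrete, the following is a minimal sketch of one form it could take, not a restatement of the paper's exact result. The notation is assumed for illustration: $x_i$ denotes the design predictors for unit $i$, $\pi_c(x_i)$ and $\pi_r(x_i)$ denote the conditional inclusion probabilities for the convenience and reference samples, and $z_i = 1$ indicates that a record in the pooled (stacked) sample came from the convenience sample. The sketch further assumes that the two samples are simply stacked, so a unit selected into both appears twice.

```latex
% Assumed setup: the observed sample stacks the convenience and reference samples.
% A population unit with predictors x_i contributes, in expectation,
% \pi_c(x_i) convenience records and \pi_r(x_i) reference records,
% so the share of its records that are convenience records is:
\[
  \pi_z(x_i) \;=\; \Pr(z_i = 1 \mid x_i)
  \;=\; \frac{\pi_c(x_i)}{\pi_c(x_i) + \pi_r(x_i)} .
\]
% Under this sketch, the sample-membership indicators admit an exact
% Bernoulli likelihood over the n records of the pooled sample:
\[
  L(\pi_c, \pi_r \mid z, x)
  \;=\; \prod_{i=1}^{n} \pi_z(x_i)^{\,z_i}\,\bigl(1 - \pi_z(x_i)\bigr)^{1 - z_i} .
\]
```

Under these assumptions the likelihood involves only the observed pooled sample, which is what distinguishes it from a reference sample-weighted pseudo likelihood.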