Non-representative surveys are commonly used and widely available, but they suffer from selection bias that weighting techniques generally cannot eliminate entirely. We instead propose a Bayesian method that synthesizes longitudinal, representative (unbiased) surveys with non-representative (biased) surveys by estimating the degree of selection bias over time. A simulation study shows that synthesizing biased and unbiased surveys outperforms using the unbiased surveys alone, even when the selection bias evolves in a complex manner over time. Using COVID-19 vaccination data, we synthesize two large-sample biased surveys with an unbiased survey, reducing uncertainty in nowcasting and inference estimates while retaining the empirical credible interval coverage. Conceptually, the method obtains the properties of a large-sample unbiased survey, provided that the survey assumed to be unbiased, which anchors the estimates, is in fact unbiased at all time points.