Selection bias threatens the validity of statistical inference from non-probability surveys. This study compared estimates of first-dose COVID-19 vaccination rates among Indian adults in 2021 from a large non-probability survey, the COVID-19 Trends and Impact Survey (CTIS), and a small probability survey, the Center for Voting Options and Trends in Election Research (CVoter), against benchmark data from the COVID Vaccine Intelligence Network (CoWIN). CTIS exhibited a larger estimation error (0.39) than CVoter (0.16). We also investigated the estimation accuracy of CTIS on a relative scale and found that changing the estimand from the overall vaccination rate significantly increased the effective sample size. These results suggest that the big data paradox can manifest in countries beyond the US, and that it may not apply to every estimand of interest.