The use of non-probability data sources for statistical purposes has become increasingly popular in recent years, including in official statistics. However, statistical inference based on non-probability samples is difficult because such samples are biased and not representative of the target population. In this paper we propose a quantile balancing inverse probability weighting (QBIPW) estimator for non-probability samples. We build on the idea of Harms and Duchesne (2006), which makes it possible to include quantile information in the estimation process so that known totals and distributions of auxiliary variables are reproduced. We discuss the estimation of the QBIPW probabilities and of the corresponding variance. Our simulation study demonstrates that the proposed estimators are robust against model mis-specification and, as a result, help to reduce bias and mean squared error. Finally, we apply the proposed methods to estimate the share of vacancies aimed at Ukrainian workers in Poland using an integrated set of administrative and survey data about job vacancies.
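To illustrate the idea of balancing on quantiles as well as totals, the following minimal sketch computes weights for a non-probability sample that reproduce a known population size, a known total of one auxiliary variable, and known quantiles of that variable. It is not the QBIPW estimator itself: the exponential weight form, the plain indicator 1{x <= q} used for the quantile constraints, the damped Newton solver, and all names (`quantile_balancing_weights`, `pop_quantiles`, `probs`) are assumptions made for illustration.

```python
import numpy as np


def quantile_balancing_weights(x, pop_size, pop_total_x, pop_quantiles, probs,
                               max_iter=50, tol=1e-8):
    """Weights w_i = exp(z_i' theta) such that sum_i w_i z_i equals known targets.

    z_i stacks an intercept, x_i, and indicators 1{x_i <= q_alpha}; the targets
    are the population size N, the total of x, and N * alpha per quantile.
    (Illustrative simplification, not the estimator proposed in the paper.)
    """
    x = np.asarray(x, dtype=float)
    indicators = [(x <= q).astype(float) for q in pop_quantiles]
    z = np.column_stack([np.ones_like(x), x] + indicators)
    targets = np.array([pop_size, pop_total_x] + [pop_size * p for p in probs],
                       dtype=float)

    theta = np.zeros(z.shape[1])
    theta[0] = np.log(pop_size / x.size)  # start from uniform weights N / n

    for _ in range(max_iter):
        w = np.exp(z @ theta)
        gap = z.T @ w - targets  # violation of the balancing constraints
        if np.max(np.abs(gap)) <= tol * pop_size:
            break
        # Newton direction: the Jacobian of the constraints is Z' diag(w) Z.
        step = np.linalg.solve(z.T @ (w[:, None] * z), gap)
        # Halve the step until the constraint violation shrinks.
        t = 1.0
        while t > 1e-8:
            new_gap = z.T @ np.exp(z @ (theta - t * step)) - targets
            if np.linalg.norm(new_gap) < np.linalg.norm(gap):
                break
            t *= 0.5
        theta = theta - t * step

    return np.exp(z @ theta)
```

Given such weights `w` and a study variable observed in the sample, say a hypothetical array `y_sample`, a weighted mean can be formed as `np.sum(w * y_sample) / np.sum(w)`. Adding the indicator constraints forces the weighted sample to match the population distribution of the auxiliary variable at the chosen quantiles, not only its total.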