Calibration weighting is widely used to correct selection bias in non-probability sampling, missing data, and causal inference. The main idea is to adjust the subject weights so that the biased sample matches the benchmark. However, hard calibration, which enforces exact calibration, can produce extreme weights when applied to a large set of extraneous covariates. This article proposes a soft calibration scheme in which the outcome and the selection indicator follow mixed-effects models. The scheme imposes exact calibration on the fixed effects and approximate calibration on the random effects. On the one hand, soft calibration has an intrinsic connection with best linear unbiased prediction, which yields more efficient estimation than hard calibration. On the other hand, soft calibration weighting can be viewed as penalized propensity score weight estimation, with the penalty term motivated by the mixed-effects structure. We derive the asymptotic distribution of the soft calibration estimator and a valid variance estimator. Simulation studies and a real-data application show that the proposed estimator outperforms competing methods.
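To make the contrast between exact and approximate calibration concrete, the following is a minimal sketch and not the estimator derived in the article. It assumes a chi-square distance between the calibrated weights and the initial weights, an exact constraint on the fixed-effect covariates X, and a ridge-type penalty on the calibration error for the random-effect covariates Z. The function name `soft_calibration_weights`, the tuning parameter `gamma`, and the simulated data are all hypothetical and chosen only for illustration.

```python
import numpy as np


def soft_calibration_weights(X, Z, d, total_x, total_z, gamma):
    """Illustrative ridge-type calibration (an assumption, not the paper's method).

    Minimizes sum_i (w_i - d_i)^2 / d_i + (1 / gamma) * ||Z'w - total_z||^2
    subject to X'w = total_x. The solution is w = d * (1 + A @ lam), with
    A = [X, Z] and lam solving (A' D A + P) lam = totals - A' d, where
    D = diag(d) and P = diag(0, ..., 0, gamma, ..., gamma).
    Setting gamma = 0 recovers hard (exact) calibration on both X and Z.
    """
    A = np.hstack([X, Z])
    p, q = X.shape[1], Z.shape[1]
    # No penalty on the X block (exact), ridge penalty on the Z block (approximate).
    penalty = np.diag(np.concatenate([np.zeros(p), np.full(q, gamma)]))
    gram = A.T @ (d[:, None] * A)
    target = np.concatenate([total_x, total_z])
    lam = np.linalg.solve(gram + penalty, target - A.T @ d)
    return d * (1.0 + A @ lam)


# Simulated example: all values below are made up for illustration.
rng = np.random.default_rng(0)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # fixed-effect covariates
Z = rng.normal(size=(n, 5))                            # random-effect covariates
d = np.full(n, 10.0)                                   # initial weights
total_x = np.array([2000.0, 30.0])                     # benchmark totals for X
total_z = np.full(5, 20.0)                             # benchmark totals for Z

w_soft = soft_calibration_weights(X, Z, d, total_x, total_z, gamma=2000.0)
w_softer = soft_calibration_weights(X, Z, d, total_x, total_z, gamma=20000.0)
w_hard = soft_calibration_weights(X, Z, d, total_x, total_z, gamma=0.0)

print("max X error (soft):", np.abs(X.T @ w_soft - total_x).max())
print("max Z error (soft):", np.abs(Z.T @ w_soft - total_z).max())
print("max Z error (hard):", np.abs(Z.T @ w_hard - total_z).max())
```

In this sketch, larger values of `gamma` relax the calibration on Z while the constraint on X stays exact. The chi-square distance is chosen only because it gives a closed-form solution, and it does not guarantee positive weights.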