Non-probability data sources are increasingly considered in small area estimation, but inverse probability weighting (IPW) gives model-dependent domain estimators whose reliability may vary substantially across domains. Standard Fay-Herriot (FH) smoothing borrows strength across domains, yet it treats the supplied area-level variance estimates as if they fully described the uncertainty of the input estimators. This can be misleading when some domains have weak coverage, unstable weights, or poor auxiliary balance, since these features may indicate a selection-bias risk that the estimated variance alone does not capture. We propose a diagnostics-guided variance-inflated FH estimator for finite-population domain totals. The method starts from calibrated IPW domain estimators, summarizes their reliability through a small set of domain diagnostics, and introduces a mixture variance-inflation component in the FH observation equation. Domains whose diagnostics indicate weaker IPW information are thereby smoothed more strongly toward the area-level regression mean. A validation study on a pseudo-real population of Lithuanian business enterprises, for which the true domain totals are known, shows a substantial reduction in estimation error relative to calibrated IPW.
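To illustrate why inflating the observation variance produces stronger smoothing, the following is a minimal sketch of the standard FH composite estimator with an optional per-domain inflation factor. It is not the proposed estimator: the function name `fh_shrinkage`, the deterministic multipliers in `inflation`, the assumption of a known random-effect variance `sigma2_u`, and all numbers are hypothetical and chosen for illustration only. In particular, the mixture form of the inflation component and the mapping from diagnostics to inflation are not reproduced here.

```python
import numpy as np

def fh_shrinkage(theta_hat, psi, X, sigma2_u, inflation=None):
    """Fay-Herriot composite estimates with a known random-effect variance.

    theta_hat : direct (e.g. calibrated IPW) domain estimates, shape (m,)
    psi       : supplied sampling variances of theta_hat, shape (m,)
    X         : area-level covariates including an intercept, shape (m, p)
    sigma2_u  : random-effect variance (assumed known in this sketch)
    inflation : hypothetical per-domain multipliers >= 1 applied to psi
    """
    psi_eff = psi if inflation is None else psi * inflation
    w = 1.0 / (sigma2_u + psi_eff)             # GLS weights
    XtW = X.T * w                              # scales each domain column by its weight
    beta = np.linalg.solve(XtW @ X, XtW @ theta_hat)
    synthetic = X @ beta                       # area-level regression mean
    gamma = sigma2_u / (sigma2_u + psi_eff)    # weight on the direct estimate
    return gamma * theta_hat + (1.0 - gamma) * synthetic, gamma

# Illustrative toy inputs (not data from the study)
theta_hat = np.array([10.0, 12.0, 9.0, 20.0])
psi = np.array([1.0, 2.0, 1.0, 4.0])
X = np.column_stack([np.ones(4), np.array([1.0, 2.0, 1.5, 3.0])])
sigma2_u = 4.0

# Suppose diagnostics flag only the last domain as unreliable
inflation = np.array([1.0, 1.0, 1.0, 3.0])

est_plain, gamma_plain = fh_shrinkage(theta_hat, psi, X, sigma2_u)
est_infl, gamma_infl = fh_shrinkage(theta_hat, psi, X, sigma2_u, inflation)
```

In this toy setting the weight on the direct estimate for the flagged domain drops from 0.5 to 0.25, so its composite estimate leans more heavily on the regression mean, while the weights of the unflagged domains are unchanged.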