This paper develops an updatable inverse probability weighting (UIPW) estimation method for generalized linear models with responses missing at random in streaming data sets. A two-step online updating algorithm is provided for the proposed method. In the first step, we construct an updatable estimator of the parameter in the propensity function and hence obtain an updatable estimator of the propensity function itself; in the second step, we propose a UIPW estimator of the parameter of interest that weights each observation by the inverse of the updated propensity function value at that observation. The UIPW estimation is universally applicable because it relaxes the constraint on the number of data batches. The proposed estimator is shown to be consistent and asymptotically normal, with the same asymptotic variance as the oracle estimator, so the oracle property holds. The finite-sample performance of the proposed estimator is illustrated through simulation and real data analysis. All numerical studies confirm that the UIPW estimator performs as well as the batch learner.
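The two-step scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm and makes several simplifying assumptions: the propensity model is logistic and is updated by one cumulative Newton-type step per batch, and the outcome model uses an identity link (a linear model, the simplest GLM), so the weighted estimating equation can be accumulated in closed form. The class name `StreamingIPW`, the simulator `simulate_stream`, and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def expit(z):
    """Logistic function."""
    return 1.0 / (1.0 + np.exp(-z))


class StreamingIPW:
    """Illustrative two-step online IPW estimator (assumed simplification)."""

    def __init__(self, dim):
        self.alpha = np.zeros(dim)       # propensity parameter estimate
        self.H = np.zeros((dim, dim))    # accumulated propensity information
        self.A = np.zeros((dim, dim))    # accumulated weighted X'X
        self.b = np.zeros(dim)           # accumulated weighted X'y
        self.beta = np.zeros(dim)        # parameter of interest

    def update(self, X, y, delta):
        # Step 1: update the propensity parameter with the new batch only,
        # using one Newton-type step scaled by the accumulated information.
        p = expit(X @ self.alpha)
        score = X.T @ (delta - p)
        self.H += (X * (p * (1.0 - p))[:, None]).T @ X
        self.alpha = self.alpha + np.linalg.solve(self.H, score)

        # Step 2: weight each observed response by the inverse of the
        # updated propensity and accumulate the weighted normal equations.
        pi_hat = expit(X @ self.alpha)
        w = delta / pi_hat
        y_obs = np.where(delta == 1, y, 0.0)  # missing responses get weight 0
        self.A += (X * w[:, None]).T @ X
        self.b += X.T @ (w * y_obs)
        self.beta = np.linalg.solve(self.A, self.b)
        return self.beta


def simulate_stream(n_batches=50, batch_size=400, seed=0):
    """Yield batches with responses missing at random given the covariate."""
    rng = np.random.default_rng(seed)
    beta_true = np.array([1.0, 2.0])     # hypothetical outcome parameter
    alpha_true = np.array([1.0, 0.5])    # hypothetical propensity parameter
    for _ in range(n_batches):
        x = rng.normal(size=batch_size)
        X = np.column_stack([np.ones(batch_size), x])
        y = X @ beta_true + rng.normal(size=batch_size)
        observed = rng.uniform(size=batch_size) < expit(X @ alpha_true)
        delta = observed.astype(float)
        y = np.where(observed, y, np.nan)  # unobserved responses are NaN
        yield X, y, delta


est = StreamingIPW(dim=2)
for X, y, delta in simulate_stream():
    est.update(X, y, delta)  # each batch is used once and then discarded

print("beta estimate:", est.beta)
print("alpha estimate:", est.alpha)
```

Only the summary quantities `H`, `A`, `b`, and the current estimates are stored between batches, so memory use does not grow with the number of batches. For a non-identity link, the closed-form second step would have to be replaced by an iterative update, which this sketch does not attempt.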