This paper proposes a moment-assisted subsampling method that improves the estimation efficiency of existing subsampling estimators. The approach is motivated by the fact that sample moments can be computed efficiently even when the full data set is huge. Through the generalized method of moments, the method incorporates informative sample moments of the full data into the subsampling estimator. The moment-assisted estimator is asymptotically normal and has a smaller asymptotic variance than the corresponding estimator that does not incorporate sample moments of the full data. The asymptotic variance of the moment-assisted estimator depends on the specific sample moments incorporated. Under uniform subsampling probabilities, we derive the optimal moment, which minimizes the resulting asymptotic variance in the Loewner order. Moreover, the moment-assisted subsampling estimator can be computed rapidly through a one-step linear approximation. Simulation studies and a real data analysis are conducted to compare the proposed method with existing subsampling methods. Numerical results show that the moment-assisted subsampling method performs competitively across different settings. This suggests that incorporating the sample moments of the full data can enhance existing subsampling techniques.
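To make the idea of borrowing full-data moments concrete, the following is a minimal sketch for the simplest possible target, a population mean, and it is not the estimator developed in the paper. It assumes a uniform subsample drawn with replacement and a single auxiliary moment, the full-data mean of a covariate `x`. In this toy case, combining the two moment conditions by the generalized method of moments reduces to a linear, regression-type correction of the subsample mean. The function names `moment_assisted_mean` and `simulate`, the simulated data model, and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def moment_assisted_mean(y_sub, h_sub, h_full_mean):
    """Return (plain, assisted) estimates of the mean of y.

    y_sub       : responses in the subsample, shape (n,)
    h_sub       : auxiliary moment functions on the subsample, shape (n,) or (n, k)
    h_full_mean : mean of the same functions over the full data, shape (k,)
    """
    y_sub = np.asarray(y_sub, dtype=float)
    h_sub = np.asarray(h_sub, dtype=float).reshape(len(y_sub), -1)
    plain = y_sub.mean()

    # Covariances estimated from the subsample only.
    h_centered = h_sub - h_sub.mean(axis=0)
    cov_hh = h_centered.T @ h_centered / len(y_sub)
    cov_hy = h_centered.T @ (y_sub - plain) / len(y_sub)
    coef = np.linalg.solve(cov_hh, cov_hy)

    # Linear correction: shift the plain estimate by the gap between the
    # subsample moment and the full-data moment.
    assisted = plain - (h_sub.mean(axis=0) - h_full_mean) @ coef
    return plain, assisted


def simulate(n_full=100_000, n_sub=500, n_rep=200, seed=0):
    """Compare mean squared errors of both estimators on simulated data."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n_full)
    y = 1.0 + 2.0 * x + rng.normal(size=n_full)  # assumed toy model

    h_full_mean = np.array([x.mean()])  # one cheap pass over the full data
    target = y.mean()                   # full-data quantity being approximated

    plain_err, assisted_err = [], []
    for _ in range(n_rep):
        idx = rng.choice(n_full, size=n_sub, replace=True)  # uniform subsampling
        plain, assisted = moment_assisted_mean(y[idx], x[idx], h_full_mean)
        plain_err.append((plain - target) ** 2)
        assisted_err.append((assisted - target) ** 2)
    return float(np.mean(plain_err)), float(np.mean(assisted_err))


if __name__ == "__main__":
    mse_plain, mse_assisted = simulate()
    print(f"MSE, plain subsample mean:    {mse_plain:.5f}")
    print(f"MSE, moment-assisted estimate: {mse_assisted:.5f}")
```

In this sketch the gain comes entirely from the correlation between `y` and the auxiliary moment: the more informative the incorporated moment, the larger the variance reduction, which mirrors the abstract's remark that the asymptotic variance depends on the specific sample moments incorporated.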