Maximum likelihood estimation is computationally demanding for large datasets, particularly when the likelihood function includes integrals. Subsampling can reduce the computational burden, but it often results in a loss of efficiency. This paper proposes a moment-assisted subsampling (MAS) method that improves the estimation efficiency of existing subsampling-based maximum likelihood estimators. The approach is motivated by the fact that sample moments can be computed efficiently even when the full dataset is very large. Through the generalized method of moments, the proposed method incorporates informative sample moments of the full data. The MAS estimator can be computed rapidly and is asymptotically normal, with a smaller asymptotic variance than the corresponding estimator that does not incorporate full-data sample moments. The asymptotic variance of the proposed estimator depends on the specific sample moments incorporated. We derive the optimal moment, which minimizes the resulting asymptotic variance in the Loewner order. When the optimal moment is incorporated, the MAS estimator achieves the same estimation efficiency as the full-data estimator. Numerical results demonstrate the promising performance of the proposed method in both estimation and computational efficiency compared with existing subsampling methods.
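To make the idea concrete, the following is a minimal sketch of how a full-data sample moment can sharpen a subsample estimate. It is not the paper's algorithm: it assumes uniform subsampling and uses a linearized, control-variate-style adjustment that is only a first-order stand-in for the GMM combination described above. The function name `mas_adjust`, the toy exponential model, and all variable names are hypothetical choices made for illustration.

```python
import numpy as np

def mas_adjust(theta_sub, infl_sub, h_sub, h_full_mean):
    """Adjust a scalar subsample estimate using full-data sample moments.

    Hypothetical helper (not from the paper). Arguments:
    theta_sub   : scalar estimate computed on the subsample
    infl_sub    : (r,) estimated influence values of that estimator
    h_sub       : (r,) or (r, q) moment functions evaluated on the subsample
    h_full_mean : scalar or (q,) mean of the moment functions on the full data
    """
    infl_sub = np.asarray(infl_sub, dtype=float)
    r = len(infl_sub)
    h_sub = np.asarray(h_sub, dtype=float).reshape(r, -1)
    h_full_mean = np.atleast_1d(np.asarray(h_full_mean, dtype=float))

    h_centered = h_sub - h_sub.mean(axis=0)
    psi_centered = infl_sub - infl_sub.mean()

    cov_psi_h = h_centered.T @ psi_centered / r   # (q,) covariance with the moments
    var_h = h_centered.T @ h_centered / r         # (q, q) covariance of the moments
    coef = np.linalg.solve(var_h, cov_psi_h)      # regression coefficient of psi on h

    # Shift the estimate by the part of its error that the moments explain.
    return theta_sub - coef @ (h_sub.mean(axis=0) - h_full_mean)


# Toy model (an assumption for illustration): exponential data with rate 2.
rng = np.random.default_rng(0)
x_full = rng.exponential(scale=0.5, size=200_000)
x_sub = rng.choice(x_full, size=2_000, replace=False)  # uniform subsample

rate_full = 1.0 / x_full.mean()   # full-data MLE of the rate
rate_sub = 1.0 / x_sub.mean()     # subsample MLE of the rate

# Delta-method influence values of 1 / mean(x) on the subsample.
infl = -rate_sub**2 * (x_sub - x_sub.mean())

# The only full-data quantity needed is one cheap sample mean.
rate_mas = mas_adjust(rate_sub, infl, x_sub, x_full.mean())

print("full-data MLE:   ", rate_full)
print("subsample MLE:   ", rate_sub)
print("moment-assisted: ", rate_mas)
```

In this toy model the moment function x is linear in the score of the exponential likelihood, so the adjusted estimate lands much closer to the full-data MLE than the raw subsample estimate does. With a less informative moment the gain would be smaller, which mirrors the abstract's point that the asymptotic variance depends on which moments are incorporated.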