Maximum likelihood estimation is computationally demanding for large datasets, particularly when the likelihood function involves integrals. Subsampling can reduce the computational burden, but it typically incurs a loss of efficiency. This paper proposes a moment-assisted subsampling (MAS) method that improves the estimation efficiency of existing subsampling-based maximum likelihood estimators. The approach is motivated by the fact that sample moments can be computed efficiently even when the full dataset is very large. Through the generalized method of moments, the proposed method incorporates informative sample moments of the full data. The MAS estimator can be computed rapidly and is asymptotically normal, with a smaller asymptotic variance than the corresponding estimator that does not incorporate sample moments of the full data. The asymptotic variance of the MAS estimator depends on the specific sample moments incorporated. We derive the optimal moment, which minimizes the resulting asymptotic variance in the Loewner order. Simulation studies and a real data analysis are conducted to compare the proposed method with existing subsampling methods. Numerical results demonstrate the promising performance of the MAS method across various scenarios.
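To make the idea of borrowing full-data moments concrete, the following is a minimal Python sketch of a toy analogue, not the MAS estimator itself. It assumes uniform subsampling without replacement, a scalar target (the mean of `y`), and a single auxiliary moment (the full-data mean of `x`). The function names `plain_and_assisted` and `compare`, the simulated data, and all parameter values are hypothetical. In this two-moment toy problem, the adjusted estimate coincides with the efficiently weighted generalized method of moments estimator, also known as the regression estimator in survey sampling.

```python
import numpy as np


def plain_and_assisted(y, x, idx, x_full_mean):
    """Return (plain subsample mean of y, moment-assisted estimate).

    Illustrative assumption: the only full-data quantity used is
    x_full_mean, which costs a single pass over the full data.
    """
    ys, xs = y[idx], x[idx]
    plain = ys.mean()
    # Subsample slope of y on x: the weight given to the auxiliary
    # moment condition mean(x) - x_full_mean = 0.
    beta = np.cov(ys, xs)[0, 1] / xs.var(ddof=1)
    # Correct the plain estimate by how far the subsample mean of x
    # drifts from the known full-data mean.
    assisted = plain - beta * (xs.mean() - x_full_mean)
    return plain, assisted


def compare(n_full=200_000, n_sub=500, reps=300, seed=0):
    """Empirical variances of both estimators over repeated subsamples."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n_full)
    y = 2.0 * x + rng.normal(size=n_full)  # simulated data, not from the paper
    x_full_mean = x.mean()                 # cheap full-data moment
    estimates = np.array([
        plain_and_assisted(
            y, x, rng.choice(n_full, size=n_sub, replace=False), x_full_mean
        )
        for _ in range(reps)
    ])
    return estimates.var(axis=0, ddof=1)   # (variance plain, variance assisted)


if __name__ == "__main__":
    v_plain, v_assisted = compare()
    print(f"plain subsample variance:    {v_plain:.5f}")
    print(f"moment-assisted variance:    {v_assisted:.5f}")
```

Under these assumptions the moment-assisted estimate should vary less across subsamples than the plain subsample mean, because `x` is strongly correlated with `y`. How large the gain is depends on how informative the incorporated moment is, which mirrors the role of moment choice described in the abstract.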