In this paper, we propose a novel approach, called split sample mean empirical likelihood (SSMEL), for overcoming the obstacles that empirical likelihood faces with massive data. The approach offers a new perspective on big data problems. We show that the SSMEL estimator has the same estimation efficiency as the empirical likelihood estimator computed on the full dataset, and that it retains Wilks' theorem, so the proposed approach can be used for statistical inference without estimating the covariance matrix. This addresses a key obstacle to statistical inference with the Divide and Conquer (DC) algorithm. We further illustrate the proposed approach through simulation studies and a real data analysis.
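To make the idea concrete, the following is a minimal sketch for the simplest case, inference on a scalar mean. The abstract does not spell out the construction, so the sketch assumes one reading suggested by the name: the data are randomly split into blocks, each block is reduced to its sample mean, and ordinary empirical likelihood is applied to those block means. The function names `el_log_ratio_mean` and `ssmel_log_ratio_mean`, the random split, and the bisection solver are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def el_log_ratio_mean(z, mu, n_iter=200):
    """Return -2 log empirical likelihood ratio for the mean of scalar points z at mu."""
    d = np.asarray(z, dtype=float) - mu
    # mu must lie strictly inside the range of the points, otherwise the ratio is zero.
    if d.min() >= 0 or d.max() <= 0:
        return np.inf
    # The Lagrange multiplier lam must keep every weight positive: 1 + lam * d_i > 0.
    lo, hi = -1.0 / d.max(), -1.0 / d.min()
    eps = 1e-10 * (hi - lo)
    lo, hi = lo + eps, hi - eps
    # g(lam) = sum d_i / (1 + lam * d_i) is strictly decreasing, so bisection finds its root.
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if np.sum(d / (1.0 + mid * d)) > 0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return 2.0 * np.sum(np.log1p(lam * d))


def ssmel_log_ratio_mean(x, mu, n_blocks, seed=0):
    """Assumed split-sample-mean variant: apply EL to block means instead of raw data."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(np.asarray(x, dtype=float))
    # Reduce each block to its sample mean; only n_blocks numbers enter the EL step.
    block_means = np.array([b.mean() for b in np.array_split(shuffled, n_blocks)])
    return el_log_ratio_mean(block_means, mu)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.normal(loc=1.0, scale=2.0, size=100_000)
    # If Wilks' theorem applies, this statistic is compared with a chi-square(1) quantile.
    print(ssmel_log_ratio_mean(x, mu=1.0, n_blocks=100))
```

In this sketch the empirical likelihood step works with only `n_blocks` values rather than the full sample, and the resulting statistic can be compared directly with a chi-square quantile, which is why no covariance matrix estimate appears anywhere in the code.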