Model misspecification is ubiquitous in data analysis because the data-generating process is often complex and mathematically intractable. Assessing estimation uncertainty and conducting statistical inference under a possibly misspecified working model is therefore unavoidable. In this setting, classical methods such as the bootstrap and asymptotic theory-based inference frequently fail because they rely heavily on the model assumptions. In this article, we propose a new bootstrap procedure, termed the local residual bootstrap, to assess estimation uncertainty under model misspecification for generalized linear models. By resampling residuals from neighboring observations, the procedure accurately approximates the sampling distribution of the statistic of interest. Instead of relying on the score equations, the proposed method directly regenerates the response variables, which makes standard error estimation, confidence interval construction, hypothesis testing, and model evaluation and selection straightforward. It performs similarly to the classical bootstrap when the model is correctly specified and provides a more accurate assessment of uncertainty under model misspecification, offering data analysts an easy way to guard against the impact of misspecified models. Using surrogate residuals, we establish desirable theoretical properties of the proposed method, such as bootstrap validity. Numerical results and a real data analysis further demonstrate the superiority of the proposed method.
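To make the idea of resampling residuals from neighboring observations concrete, the following is a minimal sketch and not the procedure developed in the article. It assumes a Gaussian linear working model fitted by least squares and uses ordinary residuals, whereas the article treats generalized linear models with surrogate residuals. The function name `local_residual_bootstrap`, the neighborhood size `k`, the nearest-neighbor rule in a single covariate, and the simulated data are all hypothetical choices made for illustration.

```python
import numpy as np

def local_residual_bootstrap(x, y, k=10, n_boot=500, seed=0):
    """Illustrative sketch only: linear working model, ordinary residuals."""
    rng = np.random.default_rng(seed)
    n = len(y)
    X = np.column_stack([np.ones(n), x])          # intercept + one covariate
    beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
    fitted = X @ beta_hat
    resid = y - fitted

    # Indices of the k nearest observations in x for each i (self included).
    dist = np.abs(x[:, None] - x[None, :])
    neighbors = np.argsort(dist, axis=1)[:, :k]   # shape (n, k)

    boot = np.empty((n_boot, 2))
    rows = np.arange(n)
    for b in range(n_boot):
        # For each observation, draw a residual from one of its neighbors,
        # so the local lack of fit of the working model is carried along.
        pick = rng.integers(0, k, size=n)
        donor = neighbors[rows, pick]
        y_star = fitted + resid[donor]            # regenerated responses
        boot[b] = np.linalg.lstsq(X, y_star, rcond=None)[0]
    return beta_hat, boot

# Hypothetical example: the true mean is quadratic, the working model is linear.
rng = np.random.default_rng(123)
x = rng.uniform(-2.0, 2.0, size=200)
y = x**2 + rng.normal(scale=0.5, size=200)
beta_hat, boot = local_residual_bootstrap(x, y, k=10, n_boot=200)
print("estimates:", beta_hat)
print("bootstrap standard errors:", boot.std(axis=0, ddof=1))
```

Because each regenerated response borrows a residual from a nearby observation, the bootstrap responses retain the local discrepancy between the working model and the data, which a global residual bootstrap would average away. The bootstrap draws in `boot` could then feed standard errors or percentile intervals in the usual way.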