We introduce the \textit{almost goodness-of-fit} test, a procedure to assess whether a (parametric) model provides a good representation of the probability distribution generating the observed sample. Specifically, given a distribution function $F$ and a parametric family $\mathcal{G}=\{ G(\boldsymbol{\theta}) : \boldsymbol{\theta} \in \Theta\}$, we consider the testing problem \[ H_0: \| F - G(\boldsymbol{\theta}_F) \|_p \geq \epsilon \quad \text{vs} \quad H_1: \| F - G(\boldsymbol{\theta}_F) \|_p < \epsilon, \] where $\epsilon>0$ is a margin of error and $G(\boldsymbol{\theta}_F)$ denotes a representative of $F$ within the parametric class. The approximate model is determined via an M-estimator of the parameters. The methodology also quantifies the percentage improvement of the proposed model relative to a non-informative (constant) benchmark. The test statistic is the $\mathrm{L}^p$-distance between the empirical distribution function and the distribution function of the estimated model. We present two consistent, easy-to-implement, and flexible bootstrap schemes to carry out the test. The performance of the proposal is illustrated through simulation studies and real-data applications.
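As a rough illustration of the test statistic described above, the following sketch computes an $\mathrm{L}^p$-distance between the empirical distribution function and a fitted model. Everything specific in it is an assumption made for illustration and is not taken from the source: the normal family, the maximum-likelihood fit standing in for the M-estimator, Lebesgue measure for the norm, the midpoint-rule integration, and the helper names `normal_cdf`, `fit_normal`, and `lp_distance`. The bootstrap calibration of the test is not shown.

```python
import bisect
import math
import random


def normal_cdf(x, mu, sigma):
    """Distribution function of N(mu, sigma^2) evaluated at x."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def fit_normal(sample):
    """Maximum-likelihood estimates of the normal mean and standard deviation.

    Assumption: the MLE is used here as one example of an M-estimator.
    """
    n = len(sample)
    mu = sum(sample) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in sample) / n)
    return mu, sigma


def lp_distance(sample, cdf, lo, hi, p=2.0, grid_size=20000):
    """Approximate (integral over [lo, hi] of |F_n(x) - cdf(x)|^p dx)^(1/p).

    F_n is the empirical distribution function of `sample`; the integral is
    approximated by a midpoint rule on `grid_size` equal cells (assumption:
    Lebesgue measure, with [lo, hi] wide enough that the tails are negligible).
    """
    xs = sorted(sample)
    n = len(xs)
    step = (hi - lo) / grid_size
    total = 0.0
    for i in range(grid_size):
        x = lo + (i + 0.5) * step
        ecdf = bisect.bisect_right(xs, x) / n  # fraction of observations <= x
        total += abs(ecdf - cdf(x)) ** p
    return (total * step) ** (1.0 / p)


# Simulated data; the seed and sample size are arbitrary illustrative choices.
rng = random.Random(0)
sample = [rng.gauss(0.0, 1.0) for _ in range(500)]

mu_hat, sigma_hat = fit_normal(sample)
lo = min(sample) - 10.0 * sigma_hat
hi = max(sample) + 10.0 * sigma_hat

stat = lp_distance(sample, lambda x: normal_cdf(x, mu_hat, sigma_hat), lo, hi)
print(f"L^2 distance between ECDF and fitted normal: {stat:.4f}")
```

In the testing problem above, a small value of this statistic relative to the margin $\epsilon$ is evidence for $H_1$; how small is small enough would be decided by the bootstrap schemes, which this sketch does not attempt to reproduce.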