Most existing model-checking tests in the literature do not work in high-dimensional settings, either because of the "curse of dimensionality" or because they depend on the normality of parameter estimators. To address these challenges, we propose a new goodness-of-fit test based on random projections for generalized linear models in which the dimension of the covariates may substantially exceed the sample size. Deriving the limiting distribution of the test requires only the convergence rate of the parameter estimators. The dimension is allowed to grow at an exponential rate relative to the sample size. Because a random projection maps the covariates to a one-dimensional space, our tests can detect local alternatives that depart from the null at the rate $n^{-1/2}h^{-1/4}$, where $h$ is the bandwidth and $n$ is the sample size. This detection rate does not depend on the dimension of the covariates, so the "curse of dimensionality" is largely alleviated for our tests. An interesting and unexpected result is that, for randomly chosen projections, the resulting test statistics can be asymptotically independent. We then propose combination methods to enhance the power of the tests. Detailed simulation studies and a real data analysis illustrate the effectiveness of our methodology.
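As a rough illustration of the idea described above, the following Python sketch projects the covariates onto random unit directions, computes a kernel-smoothed residual statistic for each one-dimensional projection, and combines the resulting p-values. The abstract does not give the form of the statistic or of the combination rule, so everything specific here is an assumption made for illustration: the self-normalized Zheng-type statistic, the Gaussian kernel, the bandwidth rule $h \propto n^{-1/5}$, Fisher's combination (which relies on the asymptotic independence noted above), and the hypothetical names `projected_test_stat`, `fisher_combine`, and `random_projection_gof`. The residuals are taken as given and would in practice come from whatever estimator is used to fit the null model.

```python
import math
import numpy as np


def projected_test_stat(u, resid, h):
    """Self-normalized Zheng-type statistic for a 1-D projected covariate u.

    Assumed form (not taken from the abstract):
        T = sum_{i != j} K_ij e_i e_j / sqrt(2 * sum_{i != j} K_ij^2 e_i^2 e_j^2),
    with K_ij a Gaussian kernel evaluated at (u_i - u_j) / h.
    """
    diff = (u[:, None] - u[None, :]) / h
    K = np.exp(-0.5 * diff**2) / math.sqrt(2.0 * math.pi)
    np.fill_diagonal(K, 0.0)  # drop the i == j terms
    num = resid @ K @ resid
    den = math.sqrt(2.0 * ((resid**2) @ (K**2) @ (resid**2)))
    return num / den


def fisher_combine(pvals):
    """Fisher's combination of independent p-values.

    Uses the closed-form chi-square survival function for even degrees of
    freedom 2m: P(chi2_{2m} > 2y) = exp(-y) * sum_{k < m} y^k / k!.
    """
    p = np.clip(np.asarray(pvals, dtype=float), 1e-300, 1.0)
    y = float(-np.sum(np.log(p)))  # half of Fisher's statistic
    m = len(p)
    tail = math.exp(-y) * sum(y**k / math.factorial(k) for k in range(m))
    return min(1.0, tail)


def random_projection_gof(X, resid, n_proj=5, seed=0):
    """Return per-projection one-sided p-values and their Fisher combination."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    pvals = []
    for _ in range(n_proj):
        a = rng.standard_normal(p)
        a /= np.linalg.norm(a)           # random unit direction
        u = X @ a                        # covariates projected to one dimension
        h = np.std(u) * n ** (-0.2)      # assumed rule-of-thumb bandwidth
        t = projected_test_stat(u, resid, h)
        pvals.append(0.5 * math.erfc(t / math.sqrt(2.0)))  # 1 - Phi(t)
    return pvals, fisher_combine(pvals)


if __name__ == "__main__":
    # Toy example under a correctly specified Gaussian linear model.
    # Least squares is only a stand-in for the estimator of the null model.
    rng = np.random.default_rng(1)
    n, p = 200, 20
    X = rng.standard_normal((n, p))
    beta = np.zeros(p)
    beta[:3] = 1.0
    y = X @ beta + rng.standard_normal(n)
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    pvals, combined = random_projection_gof(X, y - beta_hat @ X.T, n_proj=5)
    print(pvals, combined)
```

Each projected statistic only involves a one-dimensional kernel, which is why the bandwidth $h$ enters the detection rate while the covariate dimension does not. Other combination rules (for example, a minimum-p-value rule) could be substituted for `fisher_combine` without changing the rest of the sketch.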