In practical applications, one often does not know the "true" structure of the underlying conditional quantile function, especially in the ultra-high dimensional setting. To deal with ultra-high dimensionality, quantile-adaptive marginal nonparametric screening methods have recently been developed. However, these approaches may miss important covariates that are marginally independent of the response, or may select unimportant covariates because they are highly correlated with important ones. To mitigate these shortcomings, we develop a conditional nonparametric quantile screening procedure, complemented by a subsequent selection step, for nonparametric additive quantile regression models. Under some mild conditions, we show that the proposed screening method identifies all relevant covariates in a small number of steps with probability approaching one. The best subset subsequently selected from the screened covariates, via a modified Bayesian information criterion, also contains all the relevant covariates with overwhelming probability. The advantages of the proposed procedure are demonstrated through simulation studies and a real data example.