In practical applications, the "true" structure of the underlying conditional quantile function is often unknown, especially in the ultra-high dimensional setting. To handle ultra-high dimensionality, quantile-adaptive marginal nonparametric screening methods have recently been developed. However, these approaches may miss important covariates that are marginally independent of the response, or may select unimportant covariates because they are highly correlated with important ones. To mitigate these shortcomings, we develop a conditional nonparametric quantile screening procedure, complemented by a subsequent selection step, for nonparametric additive quantile regression models. Under mild conditions, we show that the proposed screening method identifies all relevant covariates in a small number of steps with probability approaching one. The best subset subsequently narrowed down via a modified Bayesian information criterion also contains all the relevant covariates with overwhelming probability. The advantages of the proposed procedure are demonstrated through simulation studies and a real data example.