The random feature (RF) approach is a well-established and efficient tool for scalable kernel methods, but the existing literature has primarily focused on kernel ridge regression with random features (KRR-RF), which has limitations in handling heterogeneous data with heavy-tailed noise. This paper presents a generalization study of kernel quantile regression with random features (KQR-RF), which accounts for the non-smoothness of the check loss by introducing a refined error decomposition and establishing a novel connection between KQR-RF and KRR-RF. Our study establishes capacity-dependent learning rates for KQR-RF under mild conditions on the number of RFs, and these rates are minimax optimal up to logarithmic factors. Importantly, by utilizing a data-dependent sampling strategy, our theoretical results extend to the agnostic setting, where the target quantile function may not precisely align with the assumed kernel space. With slightly modified assumptions, the capacity-dependent error analysis also applies to Lipschitz continuous losses, enabling broader applications in the machine learning community. To validate the theoretical findings, simulation experiments and a real-data application are conducted.
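To make the setting concrete, below is a minimal sketch of a KQR-RF estimator: it approximates a Gaussian kernel with random Fourier features (Rahimi and Recht, 2007) and minimizes a ridge-penalized check loss by subgradient descent. The function names (random_fourier_features, fit_kqr_rf), the choice of optimizer, and all hyperparameters are illustrative assumptions, not the algorithm analyzed in the paper.

```python
import numpy as np

def random_fourier_features(X, n_features, gamma, rng):
    """Random Fourier feature map approximating the Gaussian kernel
    k(x, x') = exp(-gamma * ||x - x'||^2)."""
    d = X.shape[1]
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

def check_loss_subgrad(residual, tau):
    """Subgradient of the check (pinball) loss
    rho_tau(r) = r * (tau - 1{r < 0})."""
    return np.where(residual >= 0, tau, tau - 1.0)

def fit_kqr_rf(X, y, tau=0.5, n_features=200, gamma=1.0,
               lam=1e-3, lr=0.1, n_iter=2000, seed=0):
    """Hypothetical KQR-RF solver: quantile regression in the
    random-feature space with a ridge penalty, trained by
    subgradient descent on the non-smooth check loss."""
    rng = np.random.default_rng(seed)
    Z = random_fourier_features(X, n_features, gamma, rng)
    n = len(y)
    w = np.zeros(n_features)
    for _ in range(n_iter):
        r = y - Z @ w
        # Gradient of (1/n) * sum rho_tau(y_i - z_i^T w) + (lam/2) ||w||^2
        g = -Z.T @ check_loss_subgrad(r, tau) / n + lam * w
        w -= lr * g

    def predict(X_new):
        # Reusing the same seed regenerates identical features (W, b).
        rng_new = np.random.default_rng(seed)
        return random_fourier_features(X_new, n_features, gamma, rng_new) @ w

    return w, predict
```

For example, fit_kqr_rf(X, y, tau=0.9) targets the conditional 0.9-quantile, which is where the robustness to heavy-tailed noise comes from: the check loss penalizes residuals linearly rather than quadratically. Subgradient descent is used here only because the check loss is non-smooth; the paper's generalization analysis does not prescribe a specific solver.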