Local Polynomial Regression (LPR) is a widely used nonparametric method for modeling complex relationships, valued for its flexibility and simplicity. It estimates a regression function by fitting low-degree polynomials to localized subsets of the data, with each observation weighted by its proximity to the point of estimation. However, traditional LPR is sensitive to outliers and high-leverage points, which can significantly degrade estimation accuracy. This paper revisits the kernel function used to compute the regression weights and proposes a novel framework that incorporates both the predictor and the response variables in the weighting mechanism. The focus of this work is a conditional density kernel that estimates the weights robustly, mitigating the influence of outliers through localized density estimation. The proposed method is implemented in Python and is publicly available at https://github.com/yaniv-shulman/rsklpr. The population analysis quantifies the bias induced by density-based robust weighting, and the reported experiments show lower empirical bias than iterative robust LOWESS while remaining competitive with standard LOWESS. The method thus extends traditional LPR to settings where robustness to outliers is required.
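To make the weighting idea concrete, the following is a minimal sketch of a local linear fit at a single query point, first with the usual predictor-only kernel weights and then with weights that also depend on the response. The function names `gaussian_kernel` and `local_linear_fit`, the bandwidth parameters `h_x` and `h_y`, and the particular density-based weight used here are assumptions chosen for illustration. They are a simplified stand-in for the idea described above and not the implementation in the rsklpr package.

```python
import numpy as np


def gaussian_kernel(u):
    """Unnormalised Gaussian kernel; the constant cancels in a weighted fit."""
    return np.exp(-0.5 * u**2)


def local_linear_fit(x, y, x0, h_x, h_y=None):
    """Local linear estimate of the regression function at x0.

    With h_y=None the weights depend on the predictor only (standard LPR).
    With h_y set, each weight is also scaled by a kernel-weighted density
    estimate of y_i among the neighbours of x0. This second mode is an
    illustrative assumption, not the rsklpr implementation.
    """
    k_x = gaussian_kernel((x - x0) / h_x)  # proximity weights in the predictor
    w = k_x
    if h_y is not None:
        # density[i] = sum_j k_x[j] * K((y_i - y_j) / h_y) / sum_j k_x[j]
        k_y = gaussian_kernel((y[:, None] - y[None, :]) / h_y)
        density = k_y @ k_x / k_x.sum()
        w = k_x * density  # responses in low-density regions are down-weighted
    # Weighted least squares for a degree-1 polynomial centred at x0.
    design = np.column_stack([np.ones_like(x), x - x0])
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(design * sw[:, None], y * sw, rcond=None)
    return beta[0]  # the intercept is the fitted value at x0


# Synthetic example: a sine curve with every tenth response shifted upwards.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 2.0 * np.pi, 200)
y = np.sin(x) + 0.1 * rng.normal(size=x.size)
y[::10] += 5.0

x0 = np.pi / 2
standard = local_linear_fit(x, y, x0, h_x=0.3)
response_aware = local_linear_fit(x, y, x0, h_x=0.3, h_y=0.5)
print(f"standard: {standard:.3f}, response-aware: {response_aware:.3f}")
```

In this sketch the contaminated responses sit in a sparsely populated region of the local response distribution, so their weights shrink and the fitted value moves back toward the bulk of the data. The bandwidths are fixed by hand here; how they are chosen in practice is outside the scope of this illustration.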