This paper proposes an adaptive penalized weighted mean regression for outlier detection in high-dimensional data. In contrast to existing approaches based on the mean shift model, the proposed estimators are robust to outliers in the response, the covariates, or both. By using the adaptive Huber loss function, the proposed method remains effective in high-dimensional linear models with heavy-tailed and heteroscedastic errors. The proposed framework estimates the regression parameters and detects outliers simultaneously. Under regularity conditions, outlier detection consistency and oracle inequalities for the robust estimates are established in high-dimensional settings. In addition, theoretical robustness properties, namely the breakdown point and a smoothed limiting influence function, are derived. Extensive simulation studies and a breast cancer survival dataset are used to evaluate the numerical performance of the proposed method, which shows comparable or superior performance in variable selection and outlier detection.
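For readers unfamiliar with the Huber loss mentioned above, the following is a minimal illustrative sketch, not the estimator proposed in the paper. It shows only the loss itself, which is quadratic for small residuals and linear for large ones, together with one commonly used way of letting the robustification parameter grow with the sample size. The function names `huber_loss` and `adaptive_tau`, the constants `sigma` and `c`, and the scaling rule `sqrt(n / log p)` are assumptions made for illustration; the weighting scheme and penalty of the proposed method are omitted.

```python
import math


def huber_loss(residual, tau):
    """Huber loss: quadratic when |residual| <= tau, linear beyond tau."""
    a = abs(residual)
    if a <= tau:
        return 0.5 * residual ** 2
    return tau * a - 0.5 * tau ** 2


def adaptive_tau(n, p, sigma=1.0, c=1.0):
    """Hypothetical tuning rule (an assumption, not taken from the paper):
    the threshold tau increases with the sample size n and decreases with
    the log of the dimension p, so the loss approaches least squares as
    more data become available."""
    return c * sigma * math.sqrt(n / math.log(p))


if __name__ == "__main__":
    tau = adaptive_tau(n=200, p=1000)
    for r in (0.5, 2.0, 50.0):
        # Large residuals are penalized linearly, which limits their influence.
        print(r, huber_loss(r, tau))
```

Because the loss grows only linearly beyond `tau`, a single extreme residual contributes a bounded gradient, which is the mechanism behind robustness to heavy-tailed errors.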