We consider the robust estimation of the parameters of multivariate Gaussian linear regression models. To this end, we consider robust versions of the usual (Mahalanobis) least-squares criterion, with or without Ridge regularization. For each contrast considered, we introduce two methods: (i) online stochastic gradient descent algorithms and their averaged versions, and (ii) offline fixed-point algorithms. Under weak assumptions, we prove the asymptotic normality of the resulting estimates. Because the variance matrix of the noise is usually unknown, we propose to plug a robust estimate of it into the Mahalanobis-based stochastic gradient descent algorithms. We show, on synthetic data, the dramatic gain in robustness of the proposed estimates compared to the classical least-squares ones. We also show the computational efficiency of the online versions of the proposed algorithms. All the proposed algorithms are implemented in the R package RobRegression, available on CRAN.
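To make the online approach concrete, the following is a minimal sketch of an averaged stochastic gradient descent for a robust regression contrast. It rests on assumptions not stated in the abstract: the contrast is taken to be the expected Euclidean norm of the residual, E‖Y − βX‖ (no Mahalanobis weighting, no Ridge penalty), and the step size is taken to be c·n^(−α). The function name and its parameters are hypothetical and do not reflect the API of the RobRegression package.

```python
import numpy as np

def averaged_robust_sgd(X, Y, c=1.0, alpha=0.66):
    """Averaged SGD for the (assumed) contrast E||Y - beta X||.

    X: array of shape (n, p), Y: array of shape (n, q).
    Returns the averaged estimate of beta, of shape (q, p).
    Hypothetical helper, not the RobRegression API.
    """
    n, p = X.shape
    q = Y.shape[1]
    beta = np.zeros((q, p))      # current SGD iterate
    beta_bar = np.zeros((q, p))  # running average of the iterates
    for i in range(n):
        x, y = X[i], Y[i]
        residual = y - beta @ x
        norm = np.linalg.norm(residual)
        if norm > 0:
            # Assumed step sequence c * n^(-alpha), with 1/2 < alpha < 1.
            step = c / (i + 1) ** alpha
            # The stochastic gradient is -(residual / norm) x^T, so each
            # observation moves beta by at most step * ||x||, whatever
            # the size of the residual.
            beta = beta + step * np.outer(residual, x) / norm
        # Recursive update of the average of the iterates.
        beta_bar = beta_bar + (beta - beta_bar) / (i + 1)
    return beta_bar
```

Because the residual is normalized before the update, a single outlying observation cannot move the iterate arbitrarily far, which is the intuition behind the robustness gain over least squares. The data are processed in a single pass, which is what makes the online version cheap.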