We investigate the statistical behavior of gradient descent iterates with dropout in the linear regression model. In particular, non-asymptotic bounds for expectations and covariance matrices of the iterates are derived. In contrast with the widely cited connection between dropout and $\ell_2$-regularization in expectation, the results indicate a much more subtle relationship, owing to interactions between the gradient descent dynamics and the additional randomness induced by dropout. We also study a simplified variant of dropout which does not have a regularizing effect and converges to the least squares estimator.
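The two procedures mentioned in the abstract can be made concrete with a minimal NumPy sketch. The abstract does not state the update rules, so the ones below are assumptions: the full variant takes a gradient step on $\tfrac12\|y - XD\beta\|^2$ with a fresh diagonal Bernoulli($p$) matrix $D$ at each iteration, while the simplified variant applies $D$ only to the ordinary least squares gradient. The function names, the step size $1/\lambda_{\max}(X^\top X)$, and the default $p = 0.8$ are hypothetical choices made for illustration. The helper `marginalized_minimizer` computes the minimizer of the dropout loss averaged over $D$, which is the weighted $\ell_2$-penalized estimator behind the connection "in expectation".

```python
import numpy as np


def dropout_gd(X, y, p=0.8, n_iter=2000, simplified=False, seed=0):
    """Gradient descent with dropout for linear regression (illustrative sketch).

    Assumed update rules (not spelled out in the abstract):
      full:        beta <- beta + lr * D X^T (y - X D beta)
      simplified:  beta <- beta + lr * D X^T (y - X beta)
    where D is a fresh diagonal matrix of i.i.d. Bernoulli(p) entries.
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # Step size 1 / lambda_max(X^T X): a conservative, hypothetical choice.
    lr = 1.0 / np.linalg.norm(X, 2) ** 2
    beta = np.zeros(d)
    for _ in range(n_iter):
        mask = rng.binomial(1, p, size=d).astype(float)  # diagonal of D
        # The full variant also drops coordinates inside the residual.
        inner = beta if simplified else mask * beta
        beta = beta + lr * mask * (X.T @ (y - X @ inner))
    return beta


def marginalized_minimizer(X, y, p=0.8):
    """Minimizer of E_D ||y - X D beta||^2 over beta.

    Averaging over D gives ||y - p X beta||^2 + p (1 - p) beta^T Diag(X^T X) beta,
    whose minimizer solves (p X^T X + (1 - p) Diag(X^T X)) beta = X^T y.
    """
    gram = X.T @ X
    return np.linalg.solve(p * gram + (1 - p) * np.diag(np.diag(gram)), X.T @ y)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(size=(50, 3))
    y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=50)
    beta_ls = np.linalg.lstsq(X, y, rcond=None)[0]
    print("least squares:       ", beta_ls)
    print("simplified dropout:  ", dropout_gd(X, y, simplified=True))
    print("full dropout iterate:", dropout_gd(X, y))
    print("marginalized minimum:", marginalized_minimizer(X, y))
```

Under these assumptions, the least squares estimator is a fixed point of the simplified update because $X^\top(y - X\hat\beta) = 0$, which is consistent with the convergence stated in the abstract. A single run of the full variant returns one random iterate, so comparing it with the marginalized minimizer across several seeds gives a rough feel for the extra randomness the abstract refers to. The sketch is not a substitute for the non-asymptotic bounds.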