Diagonal linear networks are neural networks with linear activations and diagonal weight matrices. They are of theoretical interest because their implicit regularization can be rigorously analyzed: from a small initialization, training a diagonal linear network converges to the linear predictor with minimal 1-norm among the minimizers of the training loss. In this paper, we deepen this analysis by showing that the full training trajectory of diagonal linear networks is closely related to the lasso regularization path. In this connection, the training time plays the role of an inverse regularization parameter. Both rigorous results and simulations are provided to illustrate this conclusion. Under a monotonicity assumption on the lasso regularization path, the connection is exact; in the general case, we show an approximate connection.
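As an illustration of the implicit regularization described in the abstract, the following is a minimal sketch and not the paper's experimental setup. It assumes the parametrization beta = u*u - v*v with u = v = alpha at initialization, the loss convention (1/(2n))·||X·beta - y||², and a one-sample, two-feature toy problem; the function names, step size, and number of steps are all illustrative choices. Plain gradient descent stands in for the continuous-time training dynamics, and no attempt is made to reproduce the paper's correspondence between training time and the regularization parameter.

```python
import numpy as np


def train_diagonal_network(X, y, alpha=1e-3, lr=1e-2, steps=5000):
    """Gradient descent on a diagonal linear network (illustrative sketch).

    Assumed parametrization: beta = u*u - v*v, with u = v = alpha at
    initialization, so that beta starts at exactly zero.
    Returns an array with the predictor beta after every step.
    """
    n, d = X.shape
    u = np.full(d, alpha)
    v = np.full(d, alpha)
    path = [u * u - v * v]
    for _ in range(steps):
        beta = u * u - v * v
        # Gradient of (1/(2n)) * ||X beta - y||^2 with respect to beta.
        grad_beta = X.T @ (X @ beta - y) / n
        # Chain rule: d beta / d u = 2u and d beta / d v = -2v.
        u, v = u - lr * 2 * u * grad_beta, v + lr * 2 * v * grad_beta
        path.append(u * u - v * v)
    return np.array(path)


def lasso_toy(lam):
    """Closed-form lasso solution for the toy problem below only.

    Minimizes 0.5 * (b1 + 2*b2 - 2)**2 + lam * (|b1| + |b2|).
    """
    return np.array([0.0, max(0.0, 1.0 - lam / 4.0)])


# Toy problem: one sample, two features. The interpolators satisfy
# b1 + 2*b2 = 2; the one with minimal 1-norm is (0, 1).
X = np.array([[1.0, 2.0]])
y = np.array([2.0])

path = train_diagonal_network(X, y)
beta_final = path[-1]

# Minimal 2-norm interpolator, for comparison: (0.4, 0.8).
beta_min_l2 = np.linalg.pinv(X) @ y

print("final predictor:", beta_final)
print("1-norm of final predictor:", np.abs(beta_final).sum())
print("1-norm of min 2-norm interpolator:", np.abs(beta_min_l2).sum())
print("lasso solution at lam = 2:", lasso_toy(2.0))
```

In this toy problem the lasso path is the segment from (0, 0) at large regularization to (0, 1) as the regularization vanishes. With a small alpha, the training trajectory stored in `path` starts at zero, stays close to that segment, and ends near the minimal 1-norm interpolator rather than the minimal 2-norm one. Larger values of alpha should move the limit away from the minimal 1-norm solution.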