Diagonal linear networks are neural networks with linear activations and diagonal weight matrices. Their theoretical interest lies in the fact that their implicit regularization can be rigorously analyzed: from a small initialization, the training of a diagonal linear network converges to the linear predictor with minimal 1-norm among the minimizers of the training loss. In this paper, we deepen this analysis by showing that the full training trajectory of diagonal linear networks is closely related to the lasso regularization path, with training time playing the role of an inverse regularization parameter. Both rigorous results and simulations are provided to illustrate this conclusion. Under a monotonicity assumption on the lasso regularization path the connection is exact, while in the general case we establish an approximate connection.
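The implicit bias described above can be observed numerically. The following is a minimal sketch (not from the paper): gradient descent on the standard diagonal-network parametrization beta = u * v, started from a small initialization alpha, on a toy underdetermined least-squares problem. The data matrix, step size, and iteration count are arbitrary illustrative choices.

```python
# Toy sketch: gradient descent on a diagonal linear network beta_i = u_i * v_i
# from a small initialization alpha. The iterate beta(t) should approach the
# minimum-1-norm interpolator of X beta = y, and its 1-norm should grow over
# training time, consistent with time acting as an inverse regularization.
X = [[1.0, 0.0, 1.0],
     [0.0, 1.0, 1.0]]
y = [1.0, 1.0]
n, d = len(X), len(X[0])

alpha = 1e-2                      # small initialization scale
u, v = [alpha] * d, [alpha] * d
eta = 0.1                         # gradient step size
history = []                      # 1-norm of beta along the trajectory

for _ in range(3000):
    beta = [u[i] * v[i] for i in range(d)]
    # residual of the least-squares fit and gradient with respect to beta
    r = [sum(X[k][i] * beta[i] for i in range(d)) - y[k] for k in range(n)]
    g = [sum(X[k][i] * r[k] for k in range(n)) / n for i in range(d)]
    # chain rule through beta_i = u_i * v_i (both updates use the old u, v)
    u, v = ([u[i] - eta * g[i] * v[i] for i in range(d)],
            [v[i] - eta * g[i] * u[i] for i in range(d)])
    history.append(sum(abs(u[i] * v[i]) for i in range(d)))

beta = [u[i] * v[i] for i in range(d)]
```

For this data the interpolators of X beta = y form the line (1-t, 1-t, t), whose 1-norm is minimized at (0, 0, 1); the trained beta lands near that point, and the recorded 1-norm grows from roughly 0 toward 1 along training, as the trajectory-as-regularization-path reading suggests.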