We study geometric properties of the gradient flow for learning deep linear convolutional networks. For linear fully connected networks, it was recently shown that the gradient flow on parameter space can be written as a Riemannian gradient flow on function space (i.e., on the product of the weight matrices) provided that the initialization satisfies a so-called balancedness condition. We establish that, for linear convolutional networks, the gradient flow on parameter space can be written as a Riemannian gradient flow on function space regardless of the initialization. This result holds for $D$-dimensional convolutions with $D \geq 2$; for $D = 1$, it holds if all so-called strides of the convolutions are greater than one. The corresponding Riemannian metric depends on the initialization.
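The distinction between parameter space and function space drawn in the abstract can be illustrated with a small sketch. It assumes a deliberately simplified setting that is not the paper's general one: one-dimensional, single-channel, stride-one full convolutions, for which the end-to-end map of a linear convolutional network is a convolution with the convolved filters. It illustrates only the parametrization, not the gradient flow result, which for $D = 1$ requires strides greater than one. All names (`conv_full`, `network`, `end_to_end_filter`) are hypothetical helpers introduced for illustration.

```python
def conv_full(a, b):
    """Full 1-D discrete convolution of two sequences (stride one)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def network(x, filters):
    """Apply the layers one after another: a point in parameter space acting on x."""
    h = list(x)
    for w in filters:
        h = conv_full(h, w)
    return h


def end_to_end_filter(filters):
    """Collapse all layers into one filter: the corresponding point in function space."""
    v = [1.0]
    for w in filters:
        v = conv_full(v, w)
    return v


# Illustrative filters and input (arbitrary values, not taken from the paper).
filters = [[1.0, 2.0], [0.5, -1.0, 3.0], [2.0, 1.0]]
x = [1.0, 0.0, -2.0, 4.0]

layerwise = network(x, filters)
collapsed = conv_full(x, end_to_end_filter(filters))
max_gap = max(abs(p - q) for p, q in zip(layerwise, collapsed))

# Rescaling two layers in opposite directions changes the parameters
# but leaves the end-to-end filter (the learned function) unchanged.
rescaled = [[2.0 * v for v in filters[0]], [v / 2.0 for v in filters[1]], filters[2]]
same_function = end_to_end_filter(rescaled) == end_to_end_filter(filters)
```

Here `max_gap` measures the difference between the layer-by-layer output and the output of the single collapsed filter, and `same_function` checks that two different parameter choices represent the same function. This many-to-one map from parameters to functions is why a flow on parameter space need not automatically be expressible on function space.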