We study deep linear networks endowed with a structure. Such a network takes the form of a matrix $X$ obtained by multiplying $K$ matrices, called factors, each corresponding to the action of one layer. Each factor is obtained by applying a fixed linear operator to a vector of parameters satisfying a constraint. The number of layers is not limited. Assuming that $X$ is given and that the factors have been estimated, the error between the product of the estimated factors and $X$ (i.e. the reconstruction error) is either the statistical or the empirical risk. In this paper, we provide necessary and sufficient conditions on the network topology under which a stability property holds. The stability property requires that the error on the parameters defining the factors (i.e. the stability of the recovered parameters) scales linearly with the reconstruction error (i.e. the risk). Therefore, under these conditions on the network topology, any successful learning task leads to stably defined features and hence to interpretable layers and an interpretable network. To do so, we first evaluate how the Segre embedding and its inverse distort distances. Then, we show that any deep structured linear network can be cast as a generic multilinear problem that uses the Segre embedding. This is the {\em tensorial lifting}. Using the tensorial lifting, we provide necessary and sufficient conditions for the identifiability of the factors (up to a scale rearrangement). Finally, we provide a necessary and sufficient condition, called \NSPlong~(because of the analogy with the usual Null Space Property in the compressed sensing framework), which guarantees that the stability property holds. We illustrate the theory with a practical example in which the deep structured linear network is a convolutional linear network. As expected, the conditions are rather strong, but they can be satisfied. Whether they hold can be checked with a simple test on the network topology.
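The tensorial lifting described in the abstract can be illustrated numerically. The sketch below is not the paper's code: it assumes $K = 3$ factors of size $n \times n$, each defined by $S$ unconstrained parameters, and draws the fixed linear operators at random. The names `ops`, `factor`, `product`, `segre` and `lifted` are hypothetical. It checks that the product of the factors is a linear function of the rank-one tensor built from the parameter vectors (their image under the Segre embedding), and that rescaling the parameter vectors by constants whose product is one changes neither that tensor nor $X$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed sizes (illustrative only): K factors, each n x n, each with S parameters.
K, n, S = 3, 4, 2

# Fixed linear operators, drawn at random for this sketch:
# factor k is M_k(h) = sum_i h[i] * ops[k][i].
ops = rng.standard_normal((K, S, n, n))


def factor(k, h):
    """Apply the fixed linear operator of layer k to the parameter vector h."""
    return np.tensordot(h, ops[k], axes=1)


def product(hs):
    """Matrix X obtained by multiplying the K factors."""
    X = np.eye(n)
    for k, h in enumerate(hs):
        X = X @ factor(k, h)
    return X


def segre(hs):
    """Rank-one tensor h_1 (x) h_2 (x) ... (x) h_K built from the parameter vectors."""
    T = hs[0]
    for h in hs[1:]:
        T = np.multiply.outer(T, h)
    return T


# Lifted linear operator: A[i, j, k] = ops[0][i] @ ops[1][j] @ ops[2][k].
A = np.einsum("iab,jbc,kcd->ijkad", ops[0], ops[1], ops[2])


def lifted(T):
    """Linear map from an S x S x S tensor to an n x n matrix."""
    return np.tensordot(T, A, axes=3)


hs = [rng.standard_normal(S) for _ in range(K)]

# The multilinear map hs -> X factors through the Segre embedding.
assert np.allclose(product(hs), lifted(segre(hs)))

# Scale rearrangement: scalings whose product is 1 leave the tensor and X unchanged.
scales = [2.0, 0.5, 1.0]
rescaled = [c * h for c, h in zip(scales, hs)]
assert np.allclose(segre(hs), segre(rescaled))
assert np.allclose(product(hs), product(rescaled))
```

In this toy setting, identifiability up to a scale rearrangement amounts to asking whether `lifted` is injective on rank-one tensors. The sketch only shows the lifting and the scale invariance; it does not test the stability condition itself.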