We investigate to what extent linear inverse problems can be solved with ReLU networks. Because linearity makes the problem invariant under scaling, an optimal reconstruction function $f$ for such a problem is positively homogeneous, i.e., it satisfies $f(\lambda x) = \lambda f(x)$ for all non-negative $\lambda$. For a ReLU network, this condition amounts to using networks without bias terms. We first consider the recovery of sparse vectors from few linear measurements. We prove that ReLU networks with only one hidden layer cannot recover even $1$-sparse vectors, not even approximately, regardless of the width of the network. With two hidden layers, however, stable approximate recovery is possible with arbitrary precision and for an arbitrary sparsity level $s$. We then extend our results to a wider class of recovery problems, including low-rank matrix recovery and phase retrieval. Furthermore, we consider the approximation of general positively homogeneous functions by neural networks. Extending previous work, we establish new results on the conditions under which such functions can be approximated by neural networks. Our results also shed some light on an apparent contradiction in previous work: neural networks for inverse problems typically have very large Lipschitz constants, yet they still perform well under adversarial noise. Namely, the error bounds in our expressivity results combine a small constant term with a term that is linear in the noise level, indicating that robustness issues may occur only for very small noise levels.
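The link between positive homogeneity and the absence of bias terms can be checked numerically. The following is a minimal sketch, not the construction used in the paper: the layer sizes, the random Gaussian weights, and the helper names `relu`, `matvec`, `bias_free_net`, and `random_matrix` are all assumptions made for illustration. It evaluates a bias-free ReLU network with two hidden layers at $x$ and at $\lambda x$ and compares $f(\lambda x)$ with $\lambda f(x)$.

```python
import random


def relu(v):
    """Apply the ReLU activation componentwise."""
    return [max(0.0, t) for t in v]


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * t for w, t in zip(row, v)) for row in W]


def bias_free_net(weights, x):
    """ReLU network with two hidden layers and no bias terms."""
    W1, W2, W3 = weights
    return matvec(W3, relu(matvec(W2, relu(matvec(W1, x)))))


def random_matrix(rows, cols, rng):
    """Matrix with i.i.d. standard Gaussian entries (illustrative choice)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
# Hypothetical sizes: input dimension 4, two hidden layers of width 8, output dimension 3.
weights = (
    random_matrix(8, 4, rng),
    random_matrix(8, 8, rng),
    random_matrix(3, 8, rng),
)
x = [rng.gauss(0.0, 1.0) for _ in range(4)]
lam = 2.5  # any non-negative scaling factor

f_of_scaled_x = bias_free_net(weights, [lam * t for t in x])
scaled_f_of_x = [lam * t for t in bias_free_net(weights, x)]

# Largest componentwise gap between f(lam * x) and lam * f(x);
# it should vanish up to floating-point rounding.
max_gap = max(abs(a - b) for a, b in zip(f_of_scaled_x, scaled_f_of_x))
print(max_gap)
```

The identity holds because $\mathrm{ReLU}(\lambda t) = \lambda\,\mathrm{ReLU}(t)$ for $\lambda \ge 0$ and every layer is linear. Adding a nonzero bias to any layer would in general break it.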