Neural network (NN) denoisers are an essential building block in many common tasks, ranging from image reconstruction to image generation. However, the success of these models is not well understood from a theoretical perspective. In this paper, we aim to characterize the functions realized by shallow ReLU NN denoisers in the common theoretical setting of interpolation (i.e., zero training loss) with minimal representation cost (i.e., minimal $\ell^2$ norm of the weights). First, for univariate data, we derive a closed form for the NN denoiser function, show that it is contractive toward the clean data points, and prove that it generalizes better than the empirical MMSE estimator at low noise levels. Next, for multivariate data, we find the NN denoiser functions in closed form under various geometric assumptions on the training data: data contained in a low-dimensional subspace, data contained in a union of one-sided rays, or data forming one of several types of simplexes. These functions decompose into a sum of simple rank-one piecewise linear interpolations aligned with edges and/or faces connecting training samples. We empirically verify this alignment phenomenon on synthetic data and real images.
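For readers unfamiliar with the baseline mentioned above, the following is a minimal sketch of an empirical MMSE (eMMSE) denoiser. It assumes additive isotropic Gaussian noise with known standard deviation `sigma` and a uniform empirical prior over the clean training points; both are illustrative assumptions, not details taken from the abstract. Under these assumptions the estimator is the posterior mean, i.e., a softmax-weighted average of the training points. The function name `emmse_denoise` and the toy data are hypothetical.

```python
import numpy as np

def emmse_denoise(noisy, clean_points, sigma):
    """Posterior-mean denoiser under an empirical prior (assumed Gaussian noise).

    noisy:        array of shape (m, d), noisy inputs to denoise
    clean_points: array of shape (n, d), clean training samples
    sigma:        assumed noise standard deviation (positive float)
    """
    y = np.asarray(noisy, dtype=float)
    x = np.asarray(clean_points, dtype=float)
    # Squared Euclidean distance between every noisy input and every clean point.
    sq_dist = ((y[:, None, :] - x[None, :, :]) ** 2).sum(axis=-1)  # shape (m, n)
    # Gaussian log-likelihoods up to a constant; subtract the row max for stability.
    logits = -sq_dist / (2.0 * sigma ** 2)
    logits -= logits.max(axis=1, keepdims=True)
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)
    # Weighted average of the clean points, one row per noisy input.
    return weights @ x

# Toy univariate example: two clean points at -1 and +1.
clean = np.array([[-1.0], [1.0]])
noisy = np.array([[0.0], [0.9]])
denoised = emmse_denoise(noisy, clean, sigma=0.1)
print(denoised)
```

In this toy example the midpoint input is mapped to the average of the two clean points, while an input near one clean point is pulled almost entirely onto it, which is the behavior expected of this estimator when the noise level is small relative to the spacing of the data.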