Neural network (NN) denoisers are an essential building block in many common tasks, ranging from image reconstruction to image generation. However, the success of these models is not well understood from a theoretical perspective. In this paper, we aim to characterize the functions realized by shallow ReLU NN denoisers in the common theoretical setting of interpolation (i.e., zero training loss) with minimal representation cost (i.e., minimal $\ell^2$ norm of the weights). First, for univariate data, we derive a closed form for the NN denoiser function, find that it is contractive toward the clean data points, and prove that it generalizes better than the empirical MMSE estimator at low noise levels. Next, for multivariate data, we derive the NN denoiser functions in closed form under various geometric assumptions on the training data: data contained in a low-dimensional subspace, data contained in a union of one-sided rays, or several types of simplexes. These functions decompose into a sum of simple rank-one piecewise linear interpolations aligned with edges and/or faces connecting training samples. We empirically verify this alignment phenomenon on synthetic data and real images.
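For readers unfamiliar with the baseline mentioned above, the following is a minimal sketch of an empirical MMSE estimator for univariate data. It assumes additive Gaussian noise with known standard deviation `sigma` and a uniform prior over the clean training points; the abstract does not state the noise model, so both are assumptions made here for illustration. The function name `empirical_mmse_denoiser` is hypothetical. The sketch does not implement the NN denoiser closed form, which the abstract does not give.

```python
import numpy as np


def empirical_mmse_denoiser(y, clean_points, sigma):
    """Posterior mean E[x | y] for a uniform prior on `clean_points`.

    Assumption (not from the abstract): y = x + n with n ~ N(0, sigma^2).
    Under that assumption the estimator is a softmax-weighted average
    of the clean training points.
    """
    y = np.atleast_1d(np.asarray(y, dtype=float))   # noisy inputs, shape (m,)
    x = np.asarray(clean_points, dtype=float)       # clean points, shape (n,)

    # Gaussian log-likelihood of each noisy input under each clean point.
    logits = -((y[:, None] - x[None, :]) ** 2) / (2.0 * sigma ** 2)

    # Subtract the row-wise maximum before exponentiating for stability.
    logits -= logits.max(axis=1, keepdims=True)
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)

    # Weighted average of the clean points for every noisy input.
    return weights @ x


if __name__ == "__main__":
    clean = [-1.0, 1.0]
    noisy = np.linspace(-2.0, 2.0, 9)
    print(empirical_mmse_denoiser(noisy, clean, sigma=0.1))
```

Under these assumptions and at a small `sigma`, the estimator maps almost every input to its nearest clean training point, with a sharp transition halfway between neighboring points.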