We consider inverse problems where the conditional distribution of the observation ${\bf y}$ given the latent variable of interest ${\bf x}$ (also known as the forward model) is known, and we have access to a data set in which multiple instances of ${\bf x}$ and ${\bf y}$ are both observed. In this context, algorithm unrolling has become a popular approach for designing state-of-the-art deep neural network architectures that effectively exploit the forward model. We analyze the statistical complexity of the gradient descent network (GDN), an algorithm unrolling architecture driven by proximal gradient descent. We show that the unrolling depth needed for the optimal statistical performance of GDNs is of order $\log(n)/\log(\varrho_n^{-1})$, where $n$ is the sample size and $\varrho_n$ is the convergence rate of the corresponding gradient descent algorithm. We also show that when the negative log-density of the latent variable ${\bf x}$ has a simple proximal operator, a GDN unrolled at depth $D'$ can solve the inverse problem at the parametric rate $O(D'/\sqrt{n})$. Our results thus also suggest that algorithm unrolling models are prone to overfitting as the unrolling depth $D'$ increases. We provide several examples to illustrate these results.
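To make the unrolling idea concrete, the following is a minimal sketch of the forward pass of an unrolled proximal gradient scheme, together with a helper that evaluates the depth order $\log(n)/\log(\varrho_n^{-1})$. It is an illustration under stated assumptions, not the architecture analyzed in the paper: the linear Gaussian forward model ${\bf y} = A{\bf x} + \text{noise}$, the Laplace-type prior (whose proximal operator is soft-thresholding), the function names, and the unit constant in the depth formula are all hypothetical choices. In a trained GDN the per-layer parameters would be learned from the observed $({\bf x}, {\bf y})$ pairs; here they are simply fixed.

```python
import math
import numpy as np


def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (elementwise soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)


def unrolled_prox_gd(y, A, step_sizes, thresholds):
    """Forward pass of a depth-D unrolled proximal gradient scheme.

    Assumes y = A x + Gaussian noise and an l1 (Laplace) prior on x.
    Layer k takes a gradient step on 0.5 * ||A x - y||^2 and then applies
    the prox of the prior. The depth D is len(step_sizes).
    """
    x = np.zeros(A.shape[1])
    for step, tau in zip(step_sizes, thresholds):
        grad = A.T @ (A @ x - y)          # gradient of the data-fit term
        x = soft_threshold(x - step * grad, step * tau)
    return x


def suggested_depth(n, rho):
    """Depth of order log(n) / log(1 / rho).

    The multiplicative constant is set to 1 purely for illustration.
    """
    if not 0.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between 0 and 1")
    return math.ceil(math.log(n) / math.log(1.0 / rho))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.normal(size=(20, 10))
    x_true = np.zeros(10)
    x_true[:3] = [1.0, -2.0, 0.5]
    y = A @ x_true + 0.01 * rng.normal(size=20)

    # Step size 1/L, where L is the Lipschitz constant of the gradient.
    L = np.linalg.norm(A, 2) ** 2
    depth = suggested_depth(n=1000, rho=0.5)
    x_hat = unrolled_prox_gd(y, A, [1.0 / L] * depth, [0.01] * depth)
    print("depth:", depth, "error:", np.linalg.norm(x_hat - x_true))
```

The depth helper reflects the trade-off stated above: a faster-converging algorithm (smaller $\varrho_n$) needs fewer layers, while the $O(D'/\sqrt{n})$ rate indicates that adding layers beyond what is needed increases the statistical error.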