Deep unrolling, or unfolding, is an emerging learning-to-optimize method that unrolls a truncated iterative algorithm into the layers of a trainable neural network. However, the convergence guarantees and generalizability of unrolled networks remain open theoretical problems. To tackle these problems, we endow deep unrolled architectures with a stochastic descent nature by imposing descending constraints during training. The descending constraints are enforced layer by layer, ensuring that each unrolled layer takes, on average, a descent step toward the optimum during training. We prove that the sequence formed by the outputs of the unrolled layers is then guaranteed to converge on unseen problems, provided there is no distribution shift between training and test problems. We also show that standard unrolling is brittle to perturbations, whereas the imposed constraints endow the unrolled networks with robustness to additive noise and perturbations. We numerically evaluate unrolled architectures trained under the proposed constraints in two applications: sparse coding via the learned iterative shrinkage and thresholding algorithm (LISTA) and image inpainting via proximal generative flow (GLOW-Prox), and demonstrate the performance and robustness benefits of the proposed method.
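To make the layer-wise descending constraints concrete, the following is a minimal PyTorch sketch, not the authors' implementation: it unrolls LISTA for the LASSO problem and adds a hinge penalty that encourages each unrolled layer to decrease the objective by at least a margin on average over the batch, a simplified surrogate (fixed penalty weight rather than a primal-dual scheme) for the constrained training described above. All names (DescentConstrainedLISTA, eps_margin, lam) are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def lasso_objective(x, y, A, alpha):
    """f(x) = 0.5 * ||A x - y||^2 + alpha * ||x||_1, one value per batch sample."""
    residual = y - x @ A.T
    return 0.5 * residual.pow(2).sum(dim=1) + alpha * x.abs().sum(dim=1)

class DescentConstrainedLISTA(nn.Module):
    """Unrolled LISTA: each layer applies learnable linear maps and a
    soft-thresholding nonlinearity, mimicking one ISTA iteration."""
    def __init__(self, m, n, num_layers):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.ModuleDict({
                "We": nn.Linear(m, n, bias=False),  # input-injection weights
                "S":  nn.Linear(n, n, bias=False),  # state-transition weights
            })
            for _ in range(num_layers)
        )
        # learnable per-layer soft-threshold levels
        self.theta = nn.Parameter(torch.full((num_layers,), 0.1))

    def forward(self, y):
        x = torch.zeros(y.shape[0], self.layers[0]["S"].in_features,
                        device=y.device)
        iterates = [x]
        for k, layer in enumerate(self.layers):
            z = layer["We"](y) + layer["S"](x)
            theta_k = F.softplus(self.theta[k])        # keep threshold positive
            x = torch.sign(z) * F.relu(z.abs() - theta_k)  # soft threshold
            iterates.append(x)
        return iterates

def constrained_loss(iterates, y, A, alpha, eps_margin=1e-3, lam=1.0):
    """Objective value at the last layer, plus hinge penalties pushing every
    layer to descend the objective by at least eps_margin on average."""
    loss = lasso_objective(iterates[-1], y, A, alpha).mean()
    for x_prev, x_next in zip(iterates[:-1], iterates[1:]):
        violation = (lasso_objective(x_next, y, A, alpha)
                     - lasso_objective(x_prev, y, A, alpha) + eps_margin)
        loss = loss + lam * F.relu(violation).mean()  # E[descent] surrogate
    return loss

# Example usage with synthetic data (dimensions are arbitrary):
m, n, K, B = 20, 50, 10, 32
A = torch.randn(m, n)
model = DescentConstrainedLISTA(m, n, num_layers=K)
y = torch.randn(B, m)
loss = constrained_loss(model(y), y, A, alpha=0.1)
loss.backward()
```

In a faithful implementation of the constrained formulation, the multiplier lam would be updated per layer by dual ascent so the descent conditions are enforced rather than merely penalized; the fixed-weight penalty above is just the simplest instance of the idea.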