We consider a bilevel learning framework for learning linear operators. In this framework, the learnable parameters are optimized via a loss function that also depends on the minimizer of a convex optimization problem (referred to as the lower-level problem). We utilize an iterative algorithm called `piggyback' to compute the gradient of the loss and the minimizer of the lower-level problem. Since the lower-level problem is solved numerically, the loss function, and thus its gradient, can only be computed inexactly. To estimate the accuracy of the computed hypergradient, we derive an a-posteriori error bound, which provides guidance for setting the tolerances of both the lower-level problem and the piggyback algorithm. To solve the upper-level optimization efficiently, we also propose an adaptive method for choosing a suitable step size. To illustrate the proposed method, we consider a few learned-regularizer problems, such as training an input-convex neural network.
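To make the setup concrete, the following is a minimal sketch of the bilevel formulation and the hypergradient it induces. The notation ($\theta$, $u$, $\ell$, $E$, $p$) is our own choice, and the smooth, strongly convex form of the adjoint system shown here is a simplifying assumption for illustration; the abstract itself does not fix these details.

% Sketch only: notation and smoothness assumptions are ours, not taken from the paper.
% Upper-level loss \ell evaluated at the minimizer of a lower-level problem that is
% convex in u for every fixed parameter \theta:
\begin{equation*}
  \min_{\theta}\; \ell\bigl(u^{\star}(\theta)\bigr)
  \quad\text{subject to}\quad
  u^{\star}(\theta) \in \operatorname*{arg\,min}_{u}\; E(u,\theta).
\end{equation*}
% If E(\cdot,\theta) is smooth and strongly convex, the implicit function theorem
% gives the hypergradient through an adjoint state p:
\begin{equation*}
  \nabla_{\theta}\,\ell\bigl(u^{\star}(\theta)\bigr)
  = -\,\bigl(\partial_{\theta}\partial_{u}E(u^{\star},\theta)\bigr)^{\!\top} p,
  \qquad
  \partial_{u}^{2}E(u^{\star},\theta)\, p = \nabla\ell(u^{\star}).
\end{equation*}
% The `piggyback' iteration approximates this by running the lower-level solver and
% the adjoint iteration jointly, so both u^\star and p (hence the hypergradient)
% are computed inexactly; the a-posteriori bound controls the resulting error.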