In this paper we consider the problem of learning variational models in the context of supervised learning via risk minimization. Our goal is to provide a deeper understanding of two approaches to learning variational models: bilevel optimization and algorithm unrolling. The former treats the variational model as a lower-level optimization problem nested beneath the risk minimization problem, while the latter replaces the lower-level problem with an algorithm that solves it approximately. Both approaches are used in practice, but unrolling is much simpler from a computational point of view. To analyze and compare the two approaches, we consider a simple toy model and compute all risks and the respective estimators explicitly. We show that unrolling can outperform the bilevel optimization approach, but also that its performance can depend significantly on further parameters, sometimes in unexpected ways: while the stepsize of the unrolled algorithm matters a lot (and learning the stepsize gives a significant improvement), the number of unrolled iterations plays only a minor role.
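To make the distinction between the two approaches concrete, the following is a minimal sketch on a scalar quadratic denoising problem. It is an illustrative assumption and not necessarily the toy model analyzed in the paper: the lower-level energy, the Gaussian data, the grid search, and all names (`solve_lower_level`, `unrolled`, `empirical_risk`, `lam`, `tau`) are hypothetical choices made here for illustration.

```python
import numpy as np

# Assumed lower-level problem (illustrative only):
#   x_hat(y; lam) = argmin_x 0.5 * (x - y)**2 + 0.5 * lam * x**2


def solve_lower_level(y, lam):
    """Exact minimizer of the assumed quadratic energy (bilevel approach)."""
    return y / (1.0 + lam)


def unrolled(y, lam, tau, n_iter):
    """n_iter gradient descent steps on the same energy, started at zero."""
    x = np.zeros_like(y)
    for _ in range(n_iter):
        grad = (x - y) + lam * x  # gradient of the assumed energy at x
        x = x - tau * grad
    return x


def empirical_risk(x_hat, x_true):
    """Mean squared error over the training pairs."""
    return float(np.mean((x_hat - x_true) ** 2))


# Synthetic training pairs: clean signal plus Gaussian noise (assumed setup).
rng = np.random.default_rng(0)
x_true = rng.normal(0.0, 1.0, size=10_000)
y_obs = x_true + rng.normal(0.0, 0.5, size=x_true.shape)

lams = np.linspace(0.0, 2.0, 201)
taus = np.linspace(0.05, 1.0, 20)

# Bilevel: choose lam so that the exact lower-level solution minimizes the risk.
bilevel_risk = min(
    empirical_risk(solve_lower_level(y_obs, lam), x_true) for lam in lams
)

# Unrolling: choose lam and the stepsize tau for a fixed number of iterations.
unrolled_risk = min(
    empirical_risk(unrolled(y_obs, lam, tau, 3), x_true)
    for lam in lams
    for tau in taus
)

print("bilevel risk: ", bilevel_risk)
print("unrolled risk:", unrolled_risk)
```

In this sketch both estimators are linear in `y`, so the comparison reduces to which scalar factors each parameterization can reach; a crude grid search stands in for the outer risk minimization purely to keep the example short.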