In this paper we consider the problem of learning variational models in the context of supervised learning via risk minimization. Our goal is to provide a deeper understanding of two approaches to learning variational models: bilevel optimization and algorithm unrolling. The former treats the variational model as a lower-level optimization problem nested below the risk minimization problem, while the latter replaces the lower-level problem with an algorithm that solves it approximately. Both approaches are used in practice, but unrolling is much simpler from a computational point of view. To analyze and compare the two approaches, we consider a simple toy model and compute all risks and the respective estimators explicitly. We show that unrolling can be better than the bilevel optimization approach, but also that the performance of unrolling can depend significantly on further parameters, sometimes in unexpected ways: while the stepsize of the unrolled algorithm matters a lot, the number of unrolled iterations matters only through its parity, that is, whether it is even or odd, and these two cases are notably different.
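To make the distinction between the two approaches concrete, the following is a minimal sketch and not the toy model analyzed in the paper. It assumes a scalar quadratic lower-level problem, 0.5*(x - y)^2 + 0.5*lam*x^2, whose exact minimizer is y/(1 + lam), and it replaces the upper-level risk minimization by a plain grid search over lam. All function names, the Gaussian data, the grid, and the stepsize tau = 1.5 are illustrative assumptions.

```python
import random


def lower_level_solution(y, lam):
    """Exact minimizer of 0.5*(x - y)**2 + 0.5*lam*x**2 (bilevel view)."""
    return y / (1.0 + lam)


def unrolled_solution(y, lam, tau, n_iter):
    """n_iter gradient-descent steps on the same objective, started at x = 0."""
    x = 0.0
    for _ in range(n_iter):
        grad = (x - y) + lam * x  # derivative of the lower-level objective
        x = x - tau * grad
    return x


def empirical_risk(estimator, pairs):
    """Mean squared error of estimator(y) against the ground truth x."""
    return sum((estimator(y) - x) ** 2 for x, y in pairs) / len(pairs)


def make_data(n, sigma, seed=0):
    """Assumed data model: x ~ N(0, 1), y = x + Gaussian noise of std sigma."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        pairs.append((x, x + rng.gauss(0.0, sigma)))
    return pairs


def best_lambda(risk_of_lambda, grid):
    """Grid search standing in for the upper-level risk minimization."""
    return min(grid, key=risk_of_lambda)


if __name__ == "__main__":
    pairs = make_data(200, sigma=0.5)
    grid = [0.05 * k for k in range(101)]  # candidate lam values in [0, 5]
    tau = 1.5  # arbitrary illustrative stepsize

    def bilevel_risk(lam):
        return empirical_risk(lambda y: lower_level_solution(y, lam), pairs)

    lam_bi = best_lambda(bilevel_risk, grid)
    print("bilevel  lam =", lam_bi, "risk =", bilevel_risk(lam_bi))

    for n_iter in (1, 2, 3, 4):
        def unrolled_risk(lam, n=n_iter):
            return empirical_risk(
                lambda y: unrolled_solution(y, lam, tau, n), pairs
            )

        lam_un = best_lambda(unrolled_risk, grid)
        print("unrolled N =", n_iter, "lam =", lam_un,
              "risk =", unrolled_risk(lam_un))
```

In this sketch the unrolled estimator after N steps equals y * (1 - q^N) / (1 + lam) with q = 1 - tau * (1 + lam), so the stepsize enters the learned estimator directly, and for q < 0 the sign of q^N alternates with N. This is only meant to illustrate how a dependence on the stepsize and on the parity of the iteration count can arise; it makes no claim about the explicit risks derived in the paper.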