We propose a new approach to learned optimization where we represent the computation of an optimizer's update step using a neural network. The parameters of the optimizer are then learned by training on a set of optimization tasks with the objective of performing minimization efficiently. Our innovation is a new neural network architecture, Optimus, for the learned optimizer, inspired by the classic BFGS algorithm. As in BFGS, we estimate a preconditioning matrix as a sum of rank-one updates, but we use a Transformer-based neural network to predict these updates jointly with the step length and direction. In contrast to several recent learned optimization-based approaches, our formulation allows for conditioning across the dimensions of the parameter space of the target problem while remaining applicable to optimization tasks of variable dimensionality without retraining. We demonstrate the advantages of our approach on a benchmark composed of objective functions traditionally used for the evaluation of optimization algorithms, as well as on the real-world task of physics-based visual reconstruction of articulated 3D human motion.
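To make the idea of a preconditioner built from rank-one updates concrete, the following is a minimal NumPy sketch, not the Optimus architecture itself. The function `placeholder_predict` is a hypothetical stand-in for the learned Transformer: it simply returns the normalized gradient as the rank-one vector and a fixed step length of 0.02. The names `add_rank_one` and `minimize`, and the choice to start from the identity matrix, are likewise assumptions made for illustration.

```python
import numpy as np


def add_rank_one(P, u):
    """Accumulate one rank-one update: P <- P + u u^T."""
    return P + np.outer(u, u)


def placeholder_predict(grad):
    """Hypothetical stand-in for the learned network (an assumption).

    Returns a rank-one vector u and a step length. A learned optimizer
    would predict these from the optimization history instead.
    """
    norm = np.linalg.norm(grad)
    u = grad / norm if norm > 0 else np.zeros_like(grad)
    return u, 0.02


def minimize(grad_fn, x0, num_steps=3):
    """Run a few preconditioned gradient steps x <- x - a * P @ g."""
    x = np.asarray(x0, dtype=float)
    P = np.eye(x.size)  # identity start, so P stays symmetric positive definite
    for _ in range(num_steps):
        g = grad_fn(x)
        u, step_length = placeholder_predict(g)
        P = add_rank_one(P, u)
        x = x - step_length * (P @ g)
    return x, P
```

Because the preconditioner is sized from the input vector, the same code runs on problems of any dimension, which loosely mirrors the variable-dimensionality property described above. The dense matrix `P` couples all coordinates of the gradient, in contrast to a purely per-coordinate update rule.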