We prove the convergence of Wasserstein inverse reinforcement learning for multi-objective optimization with the projected subgradient method by formulating an inverse problem of the multi-objective optimization problem. In addition, we prove the convergence of inverse reinforcement learning methods (maximum entropy inverse reinforcement learning and guided cost learning) with gradient descent and the projected subgradient method.
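For readers unfamiliar with the subgradient method with projection named in the abstract, the following is a minimal generic sketch of the iteration, not the paper's algorithm or its inverse problem. The objective, the unit-ball constraint set, the step size 1/sqrt(k), and all function names are assumptions chosen purely for illustration.

```python
import math


def project_onto_unit_ball(x):
    """Euclidean projection onto the unit ball (illustrative constraint set)."""
    norm = math.sqrt(sum(v * v for v in x))
    if norm <= 1.0:
        return list(x)
    return [v / norm for v in x]


def objective(x):
    """Toy convex, non-smooth objective (an assumption, not from the paper)."""
    return abs(x[0] - 2.0) + abs(x[1] + 1.0)


def subgradient(x):
    """One valid subgradient of the toy objective at x."""
    def sign(t):
        return (t > 0) - (t < 0)
    return [sign(x[0] - 2.0), sign(x[1] + 1.0)]


def projected_subgradient(x0, num_steps=200):
    """Take a subgradient step, project back onto the set, keep the best iterate."""
    x = project_onto_unit_ball(x0)
    best_x, best_f = x, objective(x)
    for k in range(1, num_steps + 1):
        step = 1.0 / math.sqrt(k)  # diminishing step size (illustrative choice)
        g = subgradient(x)
        x = project_onto_unit_ball([xi - step * gi for xi, gi in zip(x, g)])
        f = objective(x)
        # Subgradient steps need not decrease the objective, so track the best value.
        if f < best_f:
            best_x, best_f = x, f
    return best_x, best_f


best_x, best_f = projected_subgradient([0.0, 0.0])
```

On this toy problem the constrained minimizer is (1/√2, −1/√2) with objective value 3 − √2, which `best_x` and `best_f` approach. Tracking the best iterate rather than the last one is the usual choice for subgradient methods, since the objective is not guaranteed to decrease at every step.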