We prove that Wasserstein inverse reinforcement learning (WIRL) for multi-objective optimization converges under the projective subgradient method. The proof formulates an inverse problem of the multi-objective optimization problem that is equivalent to WIRL. In addition, we prove that two other inverse reinforcement learning methods, maximum entropy inverse reinforcement learning and guided cost learning, converge for multi-objective optimization under the projective subgradient method.
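For readers unfamiliar with the optimization scheme named above, the following is a minimal sketch of a generic subgradient iteration with projection onto a feasible set. It is not the WIRL algorithm itself: the objective (an L1 distance to a hypothetical target `c`), the box constraint `[0, 1]^n`, the diminishing step size `1/sqrt(k+1)`, and all function names are assumptions chosen only for illustration.

```python
import math

def project_box(x, lo=0.0, hi=1.0):
    # Euclidean projection onto the box [lo, hi]^n (assumed feasible set).
    return [min(max(xi, lo), hi) for xi in x]

def objective(x, c):
    # Illustrative convex, non-smooth objective: f(x) = sum_i |x_i - c_i|.
    return sum(abs(xi - ci) for xi, ci in zip(x, c))

def subgradient(x, c):
    # sign(x_i - c_i); the value 0 at a kink is a valid subgradient.
    return [(xi > ci) - (xi < ci) for xi, ci in zip(x, c)]

def projected_subgradient(c, x0, steps=2000):
    x = project_box(x0)
    best_x, best_f = x, objective(x, c)
    for k in range(steps):
        g = subgradient(x, c)
        alpha = 1.0 / math.sqrt(k + 1)  # diminishing, non-summable step size
        # Subgradient step followed by projection back onto the feasible set.
        x = project_box([xi - alpha * gi for xi, gi in zip(x, g)])
        fx = objective(x, c)
        # The method is not a descent method, so keep the best iterate seen.
        if fx < best_f:
            best_x, best_f = x, fx
    return best_x, best_f

c = [1.5, -0.5, 0.3]  # hypothetical target, partly outside the box
best_x, best_f = projected_subgradient(c, [0.5, 0.5, 0.5])
print(best_x, best_f)
```

In this toy setting the constrained minimizer is `(1, 0, 0.3)` with objective value `1.0`, and the best iterate approaches it as the step size shrinks. Tracking the best iterate matters because individual subgradient steps can increase the objective.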