Inverse reinforcement learning (IRL) is computationally challenging: common approaches require solving multiple reinforcement learning (RL) sub-problems. This work motivates the use of potential-based reward shaping to reduce the computational burden of each RL sub-problem. It serves as a proof of concept, and we hope it will inspire future developments towards computationally efficient IRL.
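To make the idea of potential-based reward shaping concrete, the sketch below applies the standard shaping term F(s, a, s') = γΦ(s') − Φ(s) to a small random tabular MDP and solves it with value iteration. It is an illustration only, not the method of this work: the MDP, the function names `value_iteration` and `shape_reward`, and both potentials are assumptions made for the example. In particular, the "oracle" potential equal to the optimal value function is an idealized case that would not be available in practice; it is included only to show how a well-chosen potential can shorten a single RL solve.

```python
import numpy as np


def value_iteration(P, R, gamma, tol=1e-10, max_sweeps=10_000):
    """Tabular Q-value iteration.

    P: (S, A, S) transition probabilities, R: (S, A, S) rewards.
    Returns the Q-table and the number of sweeps performed.
    """
    n_states, n_actions, _ = P.shape
    Q = np.zeros((n_states, n_actions))
    for sweep in range(1, max_sweeps + 1):
        V = Q.max(axis=1)
        # Expected one-step backup over next states s'.
        Q_new = np.einsum("sat,sat->sa", P, R + gamma * V[None, None, :])
        converged = np.abs(Q_new - Q).max() < tol
        Q = Q_new
        if converged:
            return Q, sweep
    return Q, max_sweeps


def shape_reward(R, phi, gamma):
    """Add the potential-based term F(s, a, s') = gamma * phi(s') - phi(s)."""
    return R + gamma * phi[None, None, :] - phi[:, None, None]


# Hypothetical random MDP, used purely for illustration.
rng = np.random.default_rng(0)
n_states, n_actions, gamma = 5, 3, 0.9
P = rng.random((n_states, n_actions, n_states))
P /= P.sum(axis=2, keepdims=True)  # normalise rows into distributions
R = rng.random((n_states, n_actions, n_states))

# Baseline solve with the unshaped reward.
Q_plain, sweeps_plain = value_iteration(P, R, gamma)

# An arbitrary potential shifts Q by -phi(s) but leaves the greedy policy unchanged.
phi_random = rng.normal(size=n_states)
Q_random, _ = value_iteration(P, shape_reward(R, phi_random, gamma), gamma)

# Idealised potential: the optimal state values of the unshaped problem.
phi_oracle = Q_plain.max(axis=1)
Q_oracle, sweeps_oracle = value_iteration(P, shape_reward(R, phi_oracle, gamma), gamma)

print("greedy policy preserved:",
      bool((Q_random.argmax(axis=1) == Q_plain.argmax(axis=1)).all()))
print("sweeps without shaping:", sweeps_plain)
print("sweeps with oracle potential:", sweeps_oracle)
```

The first check reflects the known policy-invariance property of potential-based shaping: the shaped Q-values equal the original ones minus Φ(s), so the greedy action in each state is the same. The sweep counts depend on the zero initialisation and tolerance chosen here and should be read as a toy indication, not as evidence about IRL workloads.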