In inverse reinforcement learning (IRL), the central objective is to infer the underlying reward function from observed expert behavior in a way that not only explains the given data but also generalizes to unseen scenarios. Achieving this requires coping with reward ambiguity, in which multiple reward functions explain the same expert behavior equally well. Although considerable effort has gone into this issue, current methods often struggle with high-dimensional problems and lack a geometric foundation. This paper uses optimal transport (OT) theory to provide a fresh perspective on these challenges. Using the Wasserstein distance from OT, we establish a geometric framework that quantifies reward ambiguity and identifies a central representation, or centroid, of the reward functions. These insights pave the way for robust IRL methods grounded in geometric interpretation, offering a structured approach to reward ambiguity in high-dimensional settings.
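To make the two geometric quantities named above concrete, the following is a minimal sketch and not the construction used in this paper. It assumes, purely for illustration, that each candidate reward function is summarized by its values on the same finite set of state-action pairs and is treated as an equal-size empirical distribution on the real line. Under that assumption the Wasserstein distance reduces to comparing sorted values, and the Wasserstein-2 centroid (barycenter) with uniform weights is obtained by averaging sorted values. The names `wasserstein_1d`, `barycenter_1d`, and `ambiguity_radius` are hypothetical.

```python
from typing import List, Sequence


def wasserstein_1d(u: Sequence[float], v: Sequence[float], p: int = 1) -> float:
    """p-Wasserstein distance between two equal-size empirical
    distributions on the real line (optimal coupling = sorted matching)."""
    if len(u) != len(v) or len(u) == 0:
        raise ValueError("samples must be non-empty and of equal size")
    su, sv = sorted(u), sorted(v)
    cost = sum(abs(a - b) ** p for a, b in zip(su, sv)) / len(su)
    return cost ** (1.0 / p)


def barycenter_1d(samples: Sequence[Sequence[float]]) -> List[float]:
    """Uniform-weight Wasserstein-2 barycenter of equal-size 1-D samples,
    computed by averaging order statistics (quantile averaging)."""
    if not samples:
        raise ValueError("need at least one sample")
    n = len(samples[0])
    if n == 0 or any(len(s) != n for s in samples):
        raise ValueError("samples must be non-empty and of equal size")
    sorted_samples = [sorted(s) for s in samples]
    return [sum(col) / len(col) for col in zip(*sorted_samples)]


def ambiguity_radius(samples: Sequence[Sequence[float]],
                     center: Sequence[float]) -> float:
    """Illustrative ambiguity measure (an assumption of this sketch):
    largest W2 distance from any candidate reward to the centroid."""
    return max(wasserstein_1d(s, center, p=2) for s in samples)


# Two hypothetical reward functions evaluated on four state-action pairs.
rewards = [[0.0, 1.0, 2.0, 3.0], [4.0, 3.0, 2.0, 1.0]]
centroid = barycenter_1d(rewards)
radius = ambiguity_radius(rewards, centroid)
```

Because this toy compares only the distribution of reward values, it ignores which state-action pair receives which reward; a practical high-dimensional treatment would need a ground metric on the reward representation itself and a general OT solver.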