This work introduces B-spline Movement Primitives (BMPs), a new Movement Primitive (MP) variant that uses B-splines for motion representation. B-splines are well established in motion planning because they generate complex, smooth trajectories from only a few control points while satisfying boundary conditions, that is, passing through a specified position with a specified velocity. However, existing applications of B-splines tend to ignore the higher-order statistics of trajectory distributions, which limits their use in imitation learning (IL) and reinforcement learning (RL), where modeling trajectory distributions is essential. In contrast, MPs are commonly used in IL and RL for their capacity to capture trajectory likelihoods and correlations. MPs, however, are limited in their ability to satisfy boundary conditions and usually need extra terms in the learning objective to satisfy velocity constraints. By reformulating B-splines as MPs, represented through basis functions and weight parameters, BMPs combine the strengths of both approaches: B-splines gain the ability to capture higher-order statistics while retaining their ability to satisfy boundary conditions. Empirical results in IL and RL demonstrate that BMPs broaden the applicability of B-splines in robot learning and offer greater expressiveness than existing MP variants.
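The reformulation described above, a B-spline written as basis functions multiplied by weight parameters, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`clamped_knots`, `bspline_basis`), the clamped uniform knot vector, the number of basis functions, and the Gaussian over the weights are all assumptions chosen for illustration. The sketch builds the basis matrix with the standard Cox-de Boor recursion, fixes the first two weights from a start position and velocity (a standard property of clamped B-splines), and propagates a weight distribution to a trajectory distribution.

```python
import numpy as np

def clamped_knots(num_basis, degree):
    # Clamped uniform knot vector on [0, 1] (an assumed choice): the end knots
    # are repeated degree + 1 times so the curve starts/ends at the end weights.
    num_inner = num_basis - degree - 1
    inner = np.linspace(0.0, 1.0, num_inner + 2)[1:-1]
    return np.concatenate([np.zeros(degree + 1), inner, np.ones(degree + 1)])

def bspline_basis(t, num_basis, degree):
    # Cox-de Boor recursion; returns Phi with shape (len(t), num_basis).
    knots = clamped_knots(num_basis, degree)
    t = np.asarray(t, dtype=float)
    n0 = len(knots) - 1
    B = np.zeros((len(t), n0))
    for i in range(n0):  # degree 0: indicators of the half-open knot spans
        B[:, i] = (knots[i] <= t) & (t < knots[i + 1])
    B[t == knots[-1], num_basis - 1] = 1.0  # close the last span at t = 1
    for p in range(1, degree + 1):
        B_new = np.zeros((len(t), n0 - p))
        for i in range(n0 - p):
            left = knots[i + p] - knots[i]
            right = knots[i + p + 1] - knots[i + 1]
            if left > 0:  # skip zero-width spans from repeated knots
                B_new[:, i] += (t - knots[i]) / left * B[:, i]
            if right > 0:
                B_new[:, i] += (knots[i + p + 1] - t) / right * B[:, i + 1]
        B = B_new
    return B

# Illustrative settings (assumptions, not values from the paper).
num_basis, degree = 8, 3
t = np.linspace(0.0, 1.0, 101)
Phi = bspline_basis(t, num_basis, degree)

# Hypothetical Gaussian over the weights: w ~ N(mu_w, Sigma_w).
mu_w = np.linspace(0.0, 1.0, num_basis)
Sigma_w = 0.01 * np.eye(num_basis)

# Boundary condition at t = 0: for a clamped B-spline, y(0) = w[0] and
# y'(0) = degree * (w[1] - w[0]) / knots[degree + 1].
y0, v0 = 0.2, 0.5
knots = clamped_knots(num_basis, degree)
mu_w[0] = y0
mu_w[1] = y0 + v0 * knots[degree + 1] / degree
# Make the two boundary weights deterministic so every sample meets the condition.
Sigma_w[:2, :] = 0.0
Sigma_w[:, :2] = 0.0

# Linear-Gaussian propagation: the trajectory y = Phi w is Gaussian as well.
traj_mean = Phi @ mu_w
traj_cov = Phi @ Sigma_w @ Phi.T
```

Because the trajectory is linear in the weights, correlations between time steps come directly from `Phi @ Sigma_w @ Phi.T`, while the start position and velocity depend only on the first two weights. Fixing those weights is one simple way to enforce a boundary condition in this sketch; the paper's own treatment may differ.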