We present hierarchical policy blending as optimal transport (HiPBOT). This hierarchical framework adapts the weights of low-level reactive expert policies, adding a look-ahead planning layer on the parameter space of a product of expert policies and agents. Our high-level planner realizes policy blending via unbalanced optimal transport, consolidating the scaling of the underlying Riemannian motion policies, effectively adjusting their Riemannian matrix, and deciding the priorities between experts and agents, guaranteeing safety and task success. Our experimental results, in application scenarios ranging from low-dimensional navigation to high-dimensional whole-body control, showcase the efficacy and efficiency of HiPBOT, which outperforms state-of-the-art baselines that either perform probabilistic inference or define a tree structure of experts, paving the way for new applications of optimal transport to robot control. More material is available at https://sites.google.com/view/hipobot
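To make the role of the expert weights concrete, the following is a minimal sketch of how weighted Riemannian motion policies are commonly combined by a metric-weighted average. It is an illustration under stated assumptions, not the HiPBOT implementation: the function name `blend_rmps`, the two toy experts, and the hand-picked weights are all hypothetical. In HiPBOT the weights would come from the high-level unbalanced optimal transport planner, which is not shown here.

```python
import numpy as np

def blend_rmps(accels, metrics, weights):
    """Combine expert policies by a metric-weighted average (illustrative only).

    accels:  list of desired accelerations a_i, each of shape (d,)
    metrics: list of symmetric PSD matrices M_i, each of shape (d, d)
    weights: list of non-negative scalars w_i (assumed given; in HiPBOT
             these would be produced by the high-level planner)
    """
    # Scaling expert i by w_i rescales its matrix: M_i -> w_i * M_i.
    M_sum = sum(w * M for w, M in zip(weights, metrics))
    f_sum = sum(w * (M @ a) for w, M, a in zip(weights, metrics, accels))
    # The pseudoinverse handles directions in which no expert has an opinion.
    return np.linalg.pinv(M_sum) @ f_sum

# Two hypothetical experts in a 2D task space.
goal_accel = np.array([1.0, 0.0])      # attractor pulling along x
goal_metric = np.eye(2)                # cares equally about both axes
avoid_accel = np.array([0.0, 1.0])     # obstacle avoider pushing along y
avoid_metric = np.diag([0.0, 4.0])     # only cares about the y direction

blended = blend_rmps(
    [goal_accel, avoid_accel],
    [goal_metric, avoid_metric],
    [1.0, 1.0],
)
goal_only = blend_rmps(
    [goal_accel, avoid_accel],
    [goal_metric, avoid_metric],
    [1.0, 0.0],
)
print(blended, goal_only)
```

With equal weights, the avoider dominates the y direction because its matrix is larger there, while the attractor alone determines x. Setting the avoider's weight to zero recovers the attractor's acceleration, which is the sense in which adapting the weights changes the priority between experts.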