This paper presents policy-based motion planning for robotic systems. The motion planning literature has focused mostly on open-loop trajectory planning, in which the planned trajectory is then tracked online. In contrast, we address path planning and controller synthesis simultaneously by solving the related feedback control problem. We present a novel incremental policy (iPolicy) algorithm for motion planning, which integrates sampling-based methods and set-valued optimal control methods to compute feedback controllers for the robotic system. In particular, we use sampling to incrementally construct a discrete approximation of the system's state space. Asynchronous value iterations are performed on the sampled state space to synthesize the incremental policy feedback controller. We show that the resulting estimates converge to the optimal value function on the continuous state space. Numerical results on several dynamical systems (including nonholonomic systems) verify the optimality and effectiveness of iPolicy.
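To make the interplay between incremental sampling and asynchronous value iteration concrete, the following is a minimal sketch and not the iPolicy algorithm itself. It assumes a holonomic point robot in an obstacle-free unit square with a path-length cost, and it replaces the set-valued control machinery with a simple neighbor graph. The names `ipolicy_sketch`, `in_goal`, `GOAL`, `GOAL_TOL`, and the parameters `n_samples` and `radius` are all hypothetical.

```python
import math
import random
from collections import deque

GOAL = (0.9, 0.9)   # hypothetical goal centre (assumption, not from the paper)
GOAL_TOL = 0.1      # hypothetical goal radius (assumption)


def in_goal(p):
    """Return True if point p lies inside the goal region."""
    return math.dist(p, GOAL) <= GOAL_TOL


def ipolicy_sketch(n_samples=300, radius=0.2, seed=0):
    """Grow a sampled state set and keep a cost-to-go estimate on it.

    Returns (states, value, policy), where policy[i] is the index of the
    successor state chosen for state i (None for goal or unreached states).
    """
    rng = random.Random(seed)
    states = [GOAL]     # seed the graph with the goal centre
    value = [0.0]       # cost-to-go estimate per sampled state
    policy = [None]     # feedback policy: successor index per state
    neighbors = [[]]    # undirected neighbor lists within `radius`

    for _ in range(n_samples):
        # Incrementally extend the sampled state space by one state.
        p = (rng.random(), rng.random())
        idx = len(states)
        near = [j for j in range(idx) if math.dist(p, states[j]) <= radius]
        states.append(p)
        value.append(0.0 if in_goal(p) else math.inf)
        policy.append(None)
        neighbors.append(near)
        for j in near:
            neighbors[j].append(idx)

        # Asynchronous value iteration: back up only the states affected by
        # the new sample, in place, and propagate any improvement outward.
        queue = deque([idx] + near)
        while queue:
            i = queue.popleft()
            if in_goal(states[i]):
                continue  # goal states keep value 0
            best, arg = value[i], policy[i]
            for j in neighbors[i]:
                cand = math.dist(states[i], states[j]) + value[j]
                if cand < best:
                    best, arg = cand, j
            if best < value[i]:
                value[i], policy[i] = best, arg
                queue.extend(neighbors[i])
    return states, value, policy
```

Following `policy` from any state with a finite value yields a sequence of sampled states that ends in the goal region, which is the sense in which the value estimate doubles as a feedback controller here. Handling obstacles or nonholonomic dynamics would require replacing the straight-line neighbor test and edge cost, which this sketch does not attempt.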