We approach the fundamental problem of obstacle avoidance for robotic systems through the lens of online learning. In contrast to prior work that assumes either worst-case realizations of uncertainty in the environment or a stationary stochastic model of uncertainty, we propose a method that is efficient to implement and provably grants instance-optimality with respect to perturbations of trajectories generated from an open-loop planner (in the sense of minimizing worst-case regret). The resulting policy adapts online to realizations of uncertainty and provably compares well with the best obstacle avoidance policy in hindsight from a rich class of policies. The method is validated in simulation on a dynamical system environment and compared to baseline open-loop planning and robust Hamilton-Jacobi reachability techniques. Further, it is implemented on a hardware example where a quadruped robot traverses a dense obstacle field and encounters input disturbances due to time delays, model uncertainty, and dynamics nonlinearities.