This paper addresses online motion planning for mobile robots under complex high-level tasks. Because initial knowledge is limited, the robot motion is modeled as an uncertain Markov Decision Process (MDP), while the task is specified as Linear Temporal Logic (LTL) formulas. The proposed framework enables the robot to explore and update the system model in a Bayesian way, while simultaneously optimizing the asymptotic cost of satisfying the temporal task. Theoretical guarantees are provided for the synthesized outgoing policy and safety policy. More importantly, instead of greedy exploration under the classic ergodicity assumption, a safe-return requirement is enforced such that the robot can always return to the home states with high probability. The overall method is validated by numerical simulations.
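To illustrate what updating the system model "in a Bayesian way" can look like for an MDP with unknown transition probabilities, the following is a minimal sketch that assumes a symmetric Dirichlet prior over next states for each state-action pair. The abstract does not state which prior or estimator is used, so the class name `DirichletTransitionModel`, its methods, and the state and action labels are hypothetical and chosen only for illustration.

```python
class DirichletTransitionModel:
    """Posterior over MDP transition probabilities (illustrative assumption:
    an independent symmetric Dirichlet prior per state-action pair)."""

    def __init__(self, states, prior=1.0):
        self.states = list(states)
        self.prior = prior   # pseudo-count assigned to every next state
        self.counts = {}     # (s, a, s_next) -> number of observed transitions

    def observe(self, s, a, s_next):
        """Record one observed transition s --a--> s_next."""
        key = (s, a, s_next)
        self.counts[key] = self.counts.get(key, 0) + 1

    def mean(self, s, a):
        """Posterior-mean estimate of P(. | s, a) as a dict over next states."""
        alpha = {t: self.prior + self.counts.get((s, a, t), 0)
                 for t in self.states}
        total = sum(alpha.values())
        return {t: alpha[t] / total for t in self.states}


# Hypothetical two-state example: the robot tries action "go" from "home".
model = DirichletTransitionModel(["home", "goal"], prior=1.0)
for _ in range(3):
    model.observe("home", "go", "goal")
estimate = model.mean("home", "go")
```

With a prior pseudo-count of 1 per next state and three observed transitions to `"goal"`, the posterior mean assigns probability (1 + 3) / (2 + 3) = 0.8 to `"goal"`. A planner could re-synthesize its policy from such estimates as more transitions are observed.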