Hierarchical Motion Planning under Probabilistic Temporal Tasks and Safe-Return Constraints

Safety is crucial for robotic missions within an uncertain environment. Common safety requirements such as collision avoidance are only state-dependent, which can be restrictive for complex missions. In this work, we address a more general formulation as safe-return constraints, which require the existence of a return-policy to drive the system back to a set of safe states with high probability. The robot motion is modeled as a Markov Decision Process (MDP) with probabilistic labels, which can be highly non-ergodic. The robotic task is specified as Linear Temporal Logic (LTL) formulas over these labels, such as surveillance and transportation. We first provide theoretical guarantees on the re-formulation of such safe-return constraints, and a baseline solution based on computing two complete product automata. Furthermore, to tackle the computational complexity, we propose a hierarchical planning algorithm that combines the feature-based symbolic and temporal abstraction with constrained optimization. It synthesizes simultaneously two dependent motion policies: the outbound policy minimizes the overall cost of satisfying the task with a high probability, while the return policy ensures the safe-return constraints. The problem formulation is versatile regarding the robot model, task specifications and safety constraints. The proposed hierarchical algorithm is more efficient and can solve much larger problems than the baseline solution, with only a slight loss of optimality. Numerical validations include simulations and hardware experiments of a search-and-rescue mission and a planetary exploration mission over various system sizes.

翻译：安全性对于不确定环境中的机器人任务至关重要。常见的碰撞避免等安全要求仅依赖于状态，这可能会限制复杂任务的实施。本文针对更一般的安全返回约束问题展开研究，该约束要求存在一个返回策略，能以高概率将系统驱动回一组安全状态。机器人运动被建模为具有概率标签的马尔可夫决策过程（MDP），该过程可能呈现高度非遍历性。机器人任务通过线性时序逻辑（LTL）公式对这些标签进行规范，例如监视和运输。我们首先为这种安全返回约束的重构提供理论保证，并提出一种基于计算两个完整乘积自动机的基线解决方案。此外，为应对计算复杂性，我们提出了一种结合基于特征的符号抽象与时序抽象的层次化规划算法，并融入约束优化。该算法同时合成两个相互依赖的运动策略：外出策略以高概率最小化满足任务的总成本，而返回策略则确保安全返回约束。该问题公式对机器人模型、任务规范和安全约束具有通用性。所提出的层次化算法比基线解决方案效率更高，能解决规模大得多的问题，且仅牺牲少量最优性。数值验证包括不同系统规模下的搜救任务和行星探索任务的仿真及硬件实验。