Guided trajectory planning involves a leader robotic agent strategically directing a follower robotic agent so that the two collaboratively reach a designated destination. The task becomes notably challenging when the leader lacks complete knowledge of the follower's decision-making model, which calls for learning-based methods to design the cooperative plan effectively. To this end, we develop a Stackelberg game-theoretic approach based on the Koopman operator. We first formulate the guided trajectory planning problem as a dynamic Stackelberg game. We then leverage Koopman operator theory to learn a linear system model that approximates the follower's feedback dynamics. Based on this learned model, the leader uses receding horizon planning to devise a collision-free trajectory that guides the follower. Simulations demonstrate that our approach produces learned models that predict the follower's multi-step behavior more accurately than alternative learning techniques. Moreover, our approach accomplishes the guidance task and reduces the leader's planning time to nearly half that of the model-based baseline method.
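To make the Koopman-based learning step concrete, the following is a minimal sketch of how a lifted linear model of the form z_next ≈ A z + B u could be fitted by least squares, where z is a vector of observables of the follower state x and u is the leader's state acting as the input. Everything specific here is an assumption for illustration rather than the method of this work: the observables in `lift`, the stand-in follower in `follower_step`, the function names, and the random training data are all hypothetical.

```python
import numpy as np


def lift(x):
    """Hypothetical observables: the state, its elementwise squares, and a constant."""
    x = np.atleast_2d(x)
    return np.hstack([x, x**2, np.ones((x.shape[0], 1))])


def fit_koopman(X, U, X_next):
    """Least-squares fit of z_next ~ A z + B u, with z = lift(x)."""
    Z, Z_next = lift(X), lift(X_next)
    G = np.hstack([Z, U])                            # regressors [z, u], one row per sample
    K, *_ = np.linalg.lstsq(G, Z_next, rcond=None)   # solves G @ K ~ Z_next
    n_z = Z.shape[1]
    A = K[:n_z].T                                    # lifted state transition
    B = K[n_z:].T                                    # effect of the leader input
    return A, B


def predict(A, B, x0, U_seq):
    """Roll the lifted linear model forward; the first len(x0) entries of z are the state."""
    n = len(x0)
    z = lift(x0)[0]
    traj = []
    for u in U_seq:
        z = A @ z + B @ u
        traj.append(z[:n])
    return np.array(traj)


def follower_step(x, u):
    """Stand-in follower (an assumption): moves a fraction of the way toward the leader."""
    return x + 0.1 * (u - x)


# Hypothetical training data: random follower states and leader positions.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 2))
U = rng.uniform(-1.0, 1.0, size=(200, 2))
X_next = follower_step(X, U)

A, B = fit_koopman(X, U, X_next)

# Multi-step prediction of the follower under a fixed leader position.
x0 = np.array([0.5, -0.5])
U_seq = np.tile([1.0, 1.0], (10, 1))
pred = predict(A, B, x0, U_seq)
```

A lifted model of this form is linear in z and u, so a receding horizon planner could in principle use it to predict the follower's response over the horizon; the planner and the collision-avoidance constraints are not sketched here.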