Autonomous navigation for mechanical thrombectomy (MT) remains a critical challenge due to the complexity of vascular anatomy and the need for precise, real-time decision-making. Reinforcement learning (RL)-based approaches have demonstrated potential in automating endovascular navigation, but current methods often struggle to generalize across multiple patient vasculatures and to handle long-horizon tasks. We propose a world model for autonomous endovascular navigation using TD-MPC2, a model-based RL algorithm. We trained a single RL agent across multiple endovascular navigation tasks in ten real patient vasculatures and compared its performance against the state-of-the-art Soft Actor-Critic (SAC) method. Results indicate that TD-MPC2 significantly outperformed SAC in multi-task learning, achieving a 65% mean success rate compared to SAC's 37%, with notable improvements in path ratio. However, TD-MPC2 exhibited increased procedure times, suggesting a trade-off between success rate and execution speed. These findings highlight the potential of world models for improving autonomous endovascular navigation and lay the foundation for future research in generalizable AI-driven robotic interventions.