Despite their mechanical sophistication, surgical robots remain blind to their surroundings. This lack of spatial awareness causes collisions, system recoveries, and workflow disruptions, issues that will intensify with the introduction of distributed robots with independent interacting arms. Existing tracking systems rely on bulky infrared cameras and reflective markers, providing only limited views of the surgical scene and adding hardware burden in crowded operating rooms. We present a marker-free proprioception method that enables precise localisation of surgical robots under their sterile draping despite the associated obstruction of visual cues. Our method relies solely on lightweight stereo-RGB cameras and novel transformer-based deep learning models. It builds on the largest multi-centre spatial robotic surgery dataset to date (1.4M self-annotated images from human cadaveric and preclinical in vivo studies). By tracking the entire robot and surgical scene, rather than individual markers, our approach provides a holistic view robust to occlusions, supporting surgical scene understanding and context-aware control. We demonstrate an example of the potential clinical benefits during in vivo breathing compensation, where our method provides access to tissue dynamics that are unobservable under state-of-the-art tracking, and show accurate localisation in multi-robot systems for future intelligent interaction. In addition, compared with existing systems, our method eliminates markers and improves tracking visibility by 25%. To our knowledge, this is the first demonstration of marker-free proprioception for fully draped surgical robots, reducing setup complexity, enhancing safety, and paving the way toward modular and autonomous robotic surgery.
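To make the described pipeline concrete, the sketch below illustrates one plausible realisation of a transformer model that fuses a stereo-RGB pair and regresses robot state. This is a hypothetical illustration, not the authors' released architecture: the class name `StereoPoseTransformer`, the patch size, the per-view embedding, and the two output heads (joint angles and a 6-DoF base pose) are all assumptions chosen to reflect the abstract's description of stereo-RGB, transformer-based proprioception.

```python
# Hypothetical sketch of stereo-RGB, transformer-based proprioception.
# Not the paper's implementation; names and dimensions are assumptions.
import torch
import torch.nn as nn

class StereoPoseTransformer(nn.Module):
    def __init__(self, num_joints=7, d_model=256, nhead=8, num_layers=4):
        super().__init__()
        # Patch embedding: 16x16 non-overlapping patches from each RGB view.
        self.patch_embed = nn.Conv2d(3, d_model, kernel_size=16, stride=16)
        # Learned embedding that tags tokens as left- or right-camera.
        self.view_embed = nn.Parameter(torch.zeros(2, 1, d_model))
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=num_layers)
        # Regression heads (assumed): joint angles and a 6-DoF base pose
        # (3 translation + 3 axis-angle rotation components).
        self.joint_head = nn.Linear(d_model, num_joints)
        self.pose_head = nn.Linear(d_model, 6)

    def tokens(self, img, view_idx):
        x = self.patch_embed(img)         # (B, d_model, H/16, W/16)
        x = x.flatten(2).transpose(1, 2)  # (B, num_patches, d_model)
        return x + self.view_embed[view_idx]

    def forward(self, left, right):
        # Concatenate both views so self-attention fuses stereo cues.
        x = torch.cat([self.tokens(left, 0), self.tokens(right, 1)], dim=1)
        x = self.encoder(x).mean(dim=1)   # pool fused stereo tokens
        return self.joint_head(x), self.pose_head(x)

model = StereoPoseTransformer()
left = torch.randn(1, 3, 224, 224)   # dummy stereo pair
right = torch.randn(1, 3, 224, 224)
joints, base_pose = model(left, right)
print(joints.shape, base_pose.shape)  # torch.Size([1, 7]) torch.Size([1, 6])
```

One design point this sketch highlights: fusing the two views as a single token sequence lets self-attention relate corresponding patches across cameras, so the model can exploit stereo geometry without explicit markers or depth maps, consistent with the abstract's claim of robustness to draping and occlusion.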