Real2Sim is becoming increasingly important with the rapid development of surgical artificial intelligence (AI) and autonomy. In this work, we propose a novel Real2Sim methodology, \textit{Instrument-Splatting}, that leverages 3D Gaussian Splatting to provide fully controllable 3D reconstruction of surgical instruments from monocular surgical videos. To maintain both high visual fidelity and manipulability, we introduce geometry pre-training that binds Gaussian point clouds to part meshes with accurate geometric priors, and define a forward kinematics that controls the Gaussians as flexibly as a real instrument. To handle unposed videos, we then design a novel instrument pose tracking method that leverages semantics-embedded Gaussians to robustly refine per-frame instrument poses and joint states in a render-and-compare manner, allowing the instrument Gaussians to accurately learn textures and achieve photorealistic rendering. We validate our method on two publicly released surgical videos and four videos collected on ex vivo tissue and green screens. Quantitative and qualitative evaluations demonstrate the effectiveness and superiority of the proposed method.