Localization during endoscopic procedures is difficult because distinguishable textures and landmarks are scarce and because the endoscopic device itself imposes a limited field of view and challenging lighting conditions. Localizing within the human body during such procedures requires expert knowledge shaped by years of experience. In this work, we present a deep learning method based on anatomy recognition that constructs a surgical path in an unsupervised manner from surgical videos, modelling relative location and variations due to different viewing angles. At inference time, the model can map the frames of an unseen video onto the path and estimate the viewing angle, with the aim of providing guidance, for instance, to reach a particular destination. We test the method on a dataset of surgical videos of transsphenoidal adenomectomies, as well as on a synthetic dataset. An online tool that lets researchers upload their surgical videos to obtain anatomy detections, together with the weights of the trained YOLOv7 model, is available at: https://surgicalvision.bmic.ethz.ch.