In recent years, deep-learning techniques have dramatically reshaped computer-assisted interventions and post-operative surgical video analysis, leading to significant advances in surgeons' skills, operating room management, and overall surgical outcomes. However, progress in deep-learning-powered surgical technologies relies heavily on large-scale datasets and annotations. In particular, surgical scene understanding and phase recognition are central to computer-assisted surgery and to the post-operative assessment of cataract surgery videos. In this context, we present the largest cataract surgery video dataset, which addresses diverse requirements for building computerized surgical workflow analysis and for detecting post-operative irregularities in cataract surgery. We validate the quality of the annotations by benchmarking several state-of-the-art neural network architectures on phase recognition and surgical scene segmentation. In addition, we initiate research on domain adaptation for instrument segmentation in cataract surgery by evaluating cross-domain instrument segmentation performance in cataract surgery videos. The dataset and annotations will be made publicly available upon acceptance of the paper.