Depth Over RGB: Automatic Evaluation of Open Surgery Skills Using Depth Camera

Purpose: In this paper, we present a novel approach to the automatic evaluation of open surgery skills using depth cameras. We aim to show that depth cameras achieve results similar to those of RGB cameras, which are the common choice for automatic evaluation of open surgery skills. Depth cameras also offer robustness to lighting variations and camera positioning, simpler data compression, and enhanced privacy, making them a promising alternative to RGB cameras.

Methods: Expert and novice surgeons completed two open-suturing simulator tasks. We focused on hand and tool detection and on action segmentation in suturing procedures. YOLOv8 was used for tool detection in RGB and depth videos, and UVAST and MSTCN++ were used for action segmentation. Our study includes the collection and annotation of a dataset recorded with Azure Kinect.

Results: Depth cameras achieved results comparable to RGB cameras in both object detection and action segmentation. We also analyzed 3D hand path length, which differed significantly between expert and novice surgeons, underscoring the potential of depth cameras for capturing surgical skill. Finally, we investigated the influence of camera angle on measurement accuracy and found that 3D cameras represent hand movements more accurately.

Conclusion: Our research advances surgical skill assessment by leveraging depth cameras for more reliable and privacy-preserving evaluation. The findings suggest that depth cameras can be valuable in assessing surgical skills and provide a foundation for future research in this area.
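The 3D hand path length mentioned in the Results can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the sample trajectory, and the simplification of treating the 2D view as the 3D points with the depth coordinate dropped are all assumptions made for illustration. The sketch sums the Euclidean distances between consecutive hand positions, once in 3D and once in the flattened 2D view, to show how motion along the camera axis is lost without depth.

```python
import math
from typing import List, Sequence


def path_length(points: Sequence[Sequence[float]]) -> float:
    """Sum of Euclidean distances between consecutive points (any dimension)."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


def drop_depth(points: Sequence[Sequence[float]]) -> List[tuple]:
    """Keep only x and y, discarding z (a simplified stand-in for a 2D view)."""
    return [(p[0], p[1]) for p in points]


# Hypothetical hand positions in metres (x, y, z), with z along the camera axis.
# The first segment moves parallel to the image plane, the second toward depth.
trajectory = [
    (0.00, 0.00, 0.50),
    (0.03, 0.04, 0.50),
    (0.03, 0.04, 0.62),
]

length_3d = path_length(trajectory)              # counts both segments
length_2d = path_length(drop_depth(trajectory))  # misses the depth-only segment

print(f"3D path length: {length_3d:.3f} m")
print(f"2D path length: {length_2d:.3f} m")
```

In this toy trajectory the second segment runs purely along the camera axis, so the 2D measure does not register it at all. How much a real 2D measurement underestimates the path depends on the camera angle relative to the hand motion, which is the effect the abstract describes.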

