Hand-specific localization has attracted significant interest in the computer vision community. Although numerous datasets provide hand annotations across a range of viewpoints and settings, domain transfer techniques frequently struggle in surgical environments, mainly because gloved-hand instances are scarce and operating rooms (ORs) pose unique challenges. As a result, hand-detection models tailored to OR settings require extensive training and expensive annotation. To overcome these challenges, we present "RoHan", a novel approach for robust hand detection in the OR that leverages semi-supervised domain adaptation to handle the varying recording conditions, diverse glove colors, and occlusions common in surgical settings. Our methodology comprises two main stages: (1) a data augmentation strategy based on "Artificial Gloves," a method for augmenting publicly available hand datasets with synthetic images of hands wearing gloves; and (2) a semi-supervised domain adaptation pipeline that improves detection performance in real-world OR settings through iterative prediction refinement and efficient frame filtering. We evaluate our method on two datasets: simulated enterotomy repair and saphenous vein graft harvesting. "RoHan" substantially reduces the need for extensive labeling and model training, paving the way for the practical implementation of hand detection technologies in medical settings.
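
The frame-filtering step of the second stage can be illustrated with a minimal sketch. The abstract does not state the filtering criterion, so the rule below (keep a frame only if it has at least one detection and every detection meets a confidence threshold) is an assumption chosen for illustration, not the RoHan implementation. The names `Detection`, `select_pseudo_labels`, and `min_score` are likewise hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Detection:
    box: Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) in pixels
    score: float                    # detector confidence in [0, 1]


def select_pseudo_labels(
    predictions: Dict[str, List[Detection]],
    min_score: float = 0.8,  # hypothetical threshold, not taken from the paper
) -> Dict[str, List[Detection]]:
    """Return only the frames whose predictions look reliable enough to reuse as labels."""
    kept: Dict[str, List[Detection]] = {}
    for frame_id, detections in predictions.items():
        # Assumed rule: drop frames with no detections or with any
        # detection below the confidence threshold.
        if detections and all(d.score >= min_score for d in detections):
            kept[frame_id] = detections
    return kept
```

In an iterative scheme of this kind, the retained frames would typically be added to the training set as pseudo-labels, the detector retrained, and the unlabeled OR frames predicted again, so that the pool of usable frames can grow over successive rounds.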