In the realm of automated robotic surgery and computer-assisted interventions, understanding robotic surgical activities is paramount. Existing algorithms for surgical activity recognition predominantly cater to pre-defined closed-set paradigms, ignoring the challenges of real-world open-set scenarios. Such algorithms often falter when test samples originate from classes unseen during training. To tackle this problem, we introduce an innovative Open-Set Surgical Activity Recognition (OSSAR) framework. Our solution leverages a hyperspherical reciprocal point strategy to enhance the distinction between known and unknown classes in the feature space. Additionally, we address over-confidence on the closed set by refining model calibration, avoiding the misclassification of unknown classes as known ones. To support our claims, we establish an open-set surgical activity benchmark on the public JIGSAWS dataset. In addition, we collect a novel endoscopic submucosal dissection dataset for surgical activity tasks. Extensive comparisons and ablation experiments on these datasets demonstrate that our method significantly outperforms existing state-of-the-art approaches and can effectively address the challenges of real-world surgical scenarios. Our code is publicly accessible at https://github.com/longbai1006/OSSAR.
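To make the reciprocal point idea concrete, the following is a minimal illustrative sketch, not the paper's actual implementation: each known class is associated with a learned "reciprocal point" representing what that class is *not*, so a sample far from class k's reciprocal point scores highly for class k, while a sample close to every reciprocal point can be flagged as unknown. The function name, the cosine-distance choice, and the max-score unknown test are all assumptions made for this sketch.

```python
import numpy as np

def hyperspherical_reciprocal_score(features, reciprocal_points):
    """Illustrative open-set scoring with reciprocal points (assumed formulation).

    features:          (n_samples, dim) raw feature vectors
    reciprocal_points: (n_classes, dim) one learned reciprocal point per known class
    """
    # Project both features and reciprocal points onto the unit hypersphere.
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    r = reciprocal_points / np.linalg.norm(reciprocal_points, axis=1, keepdims=True)

    # Cosine distance from each sample to each class's reciprocal point.
    # A reciprocal point encodes the "otherness" of its class, so a LARGER
    # distance to class k's reciprocal point means the sample looks MORE like k.
    logits = 1.0 - f @ r.T          # shape (n_samples, n_classes)

    pred = logits.argmax(axis=1)    # predicted known class
    # Samples with a uniformly small maximum logit sit near every reciprocal
    # point, i.e. resemble no known class; thresholding this score rejects
    # them as unknown.
    openset_score = logits.max(axis=1)
    return pred, openset_score
```

A downstream caller would compare `openset_score` against a validation-tuned threshold to decide between "assign `pred`" and "reject as unknown".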