Real-time visual feedback from catheterization analysis is crucial for enhancing surgical safety and efficiency during endovascular interventions. However, existing datasets are often limited to specific tasks, small in scale, and lacking the comprehensive annotations needed for a broader understanding of endovascular interventions. To address these limitations, we introduce CathAction, a large-scale dataset for catheterization understanding. CathAction comprises approximately 500,000 annotated frames for catheterization action understanding and collision detection, and 25,000 ground-truth masks for catheter and guidewire segmentation. For each task, we benchmark recent related methods. We further discuss the challenges of endovascular interventions compared to traditional computer vision tasks and point out open research questions. We hope that CathAction will facilitate the development of endovascular intervention understanding methods that can be applied in real-world settings. The dataset is available at https://airvlab.github.io/cathdata/.