CholecTrack20: A Dataset for Multi-Class Multiple Tool Tracking in Laparoscopic Surgery

Tool tracking in surgical videos is vital in computer-assisted intervention for tasks like surgeon skill assessment, safety zone estimation, and human-machine collaboration during minimally invasive procedures. The lack of large-scale datasets hampers the adoption of artificial intelligence in this domain. Current datasets formalize tracking too generically and often lack surgical context, a deficiency that becomes evident when tools move out of the camera's scope: the resulting rigid trajectories hinder realistic surgical representation. This paper addresses the need for a more precise and adaptable tracking formalization tailored to the intricacies of endoscopic procedures by introducing CholecTrack20, an extensive dataset meticulously annotated for multi-class multi-tool tracking. The annotations cover three perspectives, each a different way of defining the temporal duration of a tool trajectory: (1) intraoperative, (2) intracorporeal, and (3) visibility within the camera's scope. The dataset comprises 20 laparoscopic videos with over 35,000 frames and 65,000 annotated tool instances, with details on spatial location, category, identity, operator, phase, and surgical visual conditions. This level of detail caters to the evolving assistive requirements within a procedure.
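To make the three trajectory perspectives concrete, the following minimal sketch groups annotated tool instances into tracks under each perspective. It is an illustration only, not the dataset's actual format: the `ToolInstance` record, its field names, the bounding-box layout, and the `group_trajectories` helper are all hypothetical. It also assumes one reading of the abstract, namely that an intraoperative track spans the whole procedure, an intracorporeal track ends when the tool leaves the body, and a visibility track ends when the tool leaves the camera's view.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical record layout; the real CholecTrack20 schema may differ.
@dataclass(frozen=True)
class ToolInstance:
    frame: int
    bbox: tuple              # assumed (x, y, width, height)
    category: str
    operator: str
    phase: str
    intraoperative_id: int   # identity kept for the whole procedure
    intracorporeal_id: int   # new identity after the tool leaves the body
    visibility_id: int       # new identity after the tool leaves the view

PERSPECTIVES = ("intraoperative", "intracorporeal", "visibility")

def group_trajectories(instances, perspective):
    """Map each track id under `perspective` to its sorted frame numbers."""
    if perspective not in PERSPECTIVES:
        raise ValueError(f"unknown perspective: {perspective!r}")
    key = f"{perspective}_id"
    tracks = defaultdict(list)
    for inst in sorted(instances, key=lambda i: i.frame):
        tracks[getattr(inst, key)].append(inst.frame)
    return dict(tracks)

# Made-up example: one grasper is seen in frames 1-2, leaves the view but
# stays in the body, reappears in frame 5, is withdrawn from the body,
# and is re-inserted by frame 9.
box = (0, 0, 10, 10)
demo = [
    ToolInstance(1, box, "grasper", "main surgeon", "dissection", 1, 1, 1),
    ToolInstance(2, box, "grasper", "main surgeon", "dissection", 1, 1, 1),
    ToolInstance(5, box, "grasper", "main surgeon", "dissection", 1, 1, 2),
    ToolInstance(9, box, "grasper", "main surgeon", "dissection", 1, 2, 3),
]

for name in PERSPECTIVES:
    print(name, group_trajectories(demo, name))
```

Under these assumptions the same four detections form one intraoperative track, two intracorporeal tracks, and three visibility tracks, which is the kind of distinction a single generic track identity cannot express.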

