Purpose: We propose a formal framework for modeling and segmenting minimally invasive surgical tasks using a unified set of motion primitives (MPs), enabling more objective labeling and the aggregation of different datasets. Methods: We model dry-lab surgical tasks as finite state machines in which executing MPs, the basic surgical actions, changes the surgical context, which characterizes the physical interactions among tools and objects in the surgical environment. We develop methods for labeling surgical context from video data and for automatically translating context into MP labels. We then use our framework to create the COntext and Motion Primitive Aggregate Surgical Set (COMPASS), which includes six dry-lab surgical tasks from three publicly available datasets (JIGSAWS, DESK, and ROSMA), with kinematic and video data as well as context and MP labels. Results: Our context labeling method achieves near-perfect agreement between consensus labels from crowd-sourcing and expert surgeons. Segmenting tasks into MPs yields the COMPASS dataset, which nearly triples the amount of data available for modeling and analysis and enables the generation of separate transcripts for the left and right tools. Conclusion: The proposed framework results in high-quality labeling of surgical data based on context and fine-grained MPs. Modeling surgical tasks with MPs enables the aggregation of different datasets and the separate analysis of left and right hands for bimanual coordination assessment. Our formal framework and aggregate dataset can support the development of explainable and multi-granularity models for improved surgical process analysis, skill assessment, error detection, and autonomy.
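The automatic translation of context to MP labels described in the Methods can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the abstract: the `Context` fields (what each tool holds or touches), the MP names (`Grasp`, `Release`, `Touch`, `Untouch`), and the rule that each change in a state variable emits one MP for the tool that owns that variable. It is not the authors' implementation.

```python
from typing import Dict, List, NamedTuple

NOTHING = "nothing"


class Context(NamedTuple):
    """Hypothetical context encoding: the object each tool holds or touches."""
    left_hold: str = NOTHING
    left_contact: str = NOTHING
    right_hold: str = NOTHING
    right_contact: str = NOTHING


# Assumed MP vocabulary: (MP that ends an interaction, MP that starts one).
VERBS = {"hold": ("Release", "Grasp"), "contact": ("Untouch", "Touch")}


def translate(contexts: List[Context]) -> Dict[str, List[str]]:
    """Turn a sequence of context states into per-tool MP transcripts."""
    transcripts: Dict[str, List[str]] = {"left": [], "right": []}
    for prev, curr in zip(contexts, contexts[1:]):
        for field in Context._fields:
            before, after = getattr(prev, field), getattr(curr, field)
            if before == after:
                continue  # this state variable did not change
            tool, kind = field.split("_")
            end_verb, start_verb = VERBS[kind]
            if before != NOTHING:
                transcripts[tool].append(f"{end_verb}({tool}, {before})")
            if after != NOTHING:
                transcripts[tool].append(f"{start_verb}({tool}, {after})")
    return transcripts


# Toy context sequence (invented for illustration, not dataset content).
contexts = [
    Context(),
    Context(left_hold="needle"),
    Context(left_hold="needle", right_contact="block"),
    Context(right_contact="block"),
    Context(),
]
transcripts = translate(contexts)
for tool, mps in transcripts.items():
    print(tool, mps)
```

Keeping one transcript per tool mirrors the separate left and right transcripts mentioned in the Results; a real labeling scheme would likely need task-specific state variables and more MPs than this sketch covers.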