Surgical Phase and Instrument Recognition: How to identify appropriate Dataset Splits

Source: arXiv. Accepted at the 14th International Conference on Information Processing in Computer-Assisted Interventions (IPCAI 2023). 9 pages, 4 figures, 1 table.

Purpose: Developing machine learning models for surgical workflow and instrument recognition from temporal data is challenging because surgical workflows are complex. Imbalanced data distribution is one of the major challenges in surgical workflow recognition. Obtaining meaningful results requires careful partitioning of the data into training, validation, and test sets, as well as suitable evaluation metrics.

Methods: We present an openly available web-based application for interactive exploration of dataset partitions. The visual framework facilitates the assessment of dataset splits for surgical workflow recognition, in particular the identification of sub-optimal splits. It currently supports visualization of surgical phase and instrument annotations.

Results: To validate the interactive visualizations, we use a split of the Cholec80 dataset that was specifically selected to reflect a case of strong data imbalance. Using our software, we identified phases, phase transitions, and combinations of surgical instruments that were not represented in one of the sets.

Conclusion: With highly imbalanced class distributions, special care should be taken in selecting an appropriate split in order to obtain meaningful results. Interactive data visualization is a promising approach for assessing machine learning datasets. The source code is available at https://github.com/Cardio-AI/endovis-ml
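The kind of check described in the Results can also be approximated programmatically. The following is a minimal Python sketch, not the authors' web application or its code: it assumes per-frame phase labels and per-frame instrument sets for each video, and reports which phases, phase transitions, and instrument combinations are absent from each set of a split. The function names (`phase_transitions`, `split_coverage`, `missing_per_split`), the data layout, and the toy labels are all hypothetical and chosen only for illustration.

```python
from itertools import groupby


def phase_transitions(phases):
    """Return the set of (from, to) phase changes in a per-frame phase sequence."""
    # Collapse runs of identical labels, then pair consecutive distinct phases.
    collapsed = [label for label, _ in groupby(phases)]
    return set(zip(collapsed, collapsed[1:]))


def split_coverage(videos, split):
    """Collect phases, transitions, and instrument combinations seen in each set.

    videos: dict video_id -> {"phases": [label per frame],
                              "instruments": [set of tools per frame]}
    split:  dict set_name -> list of video ids
    """
    coverage = {}
    for name, video_ids in split.items():
        phases, transitions, combos = set(), set(), set()
        for vid in video_ids:
            ann = videos[vid]
            phases.update(ann["phases"])
            transitions |= phase_transitions(ann["phases"])
            combos.update(frozenset(tools) for tools in ann["instruments"])
        coverage[name] = {
            "phases": phases,
            "transitions": transitions,
            "instrument_combinations": combos,
        }
    return coverage


def missing_per_split(coverage):
    """For each kind of item, list what occurs in the dataset but not in a given set."""
    report = {}
    for kind in ("phases", "transitions", "instrument_combinations"):
        everything = set().union(*(cov[kind] for cov in coverage.values()))
        report[kind] = {name: everything - cov[kind] for name, cov in coverage.items()}
    return report


# Toy annotations (made up for illustration, not taken from Cholec80).
videos = {
    "v1": {"phases": ["P1", "P1", "P2", "P3"],
           "instruments": [{"grasper"}, {"grasper", "hook"}, {"hook"}, set()]},
    "v2": {"phases": ["P1", "P2", "P2", "P2"],
           "instruments": [{"grasper"}, {"hook"}, {"hook"}, set()]},
    "v3": {"phases": ["P1", "P3", "P3", "P2"],
           "instruments": [{"grasper"}, {"clipper"}, {"grasper", "clipper"}, {"hook"}]},
}
split = {"train": ["v1"], "val": ["v2"], "test": ["v3"]}

report = missing_per_split(split_coverage(videos, split))
for kind, per_set in report.items():
    for name, missing in per_set.items():
        print(f"{kind} missing from {name}: {missing}")
```

In this toy split, the validation set never reaches phase P3 and the training set contains no frame with the clipper, which is the sort of gap the interactive visualizations are meant to surface. A presence check like this says nothing about how often each item occurs, so it complements rather than replaces an inspection of the class distribution.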
