Self-Knowledge Distillation for Surgical Phase Recognition

Purpose: Advances in surgical phase recognition are generally driven by training deeper networks. Rather than pursuing a more complex solution, we believe that current models can be exploited better. We propose a self-knowledge distillation framework that can be integrated into current state-of-the-art (SOTA) models without adding complexity to the models or requiring extra annotations.

Methods: Knowledge distillation is a framework for network regularization in which knowledge is distilled from a teacher network to a student network. In self-knowledge distillation, the student model becomes the teacher, so that the network learns from itself. Most phase recognition models follow an encoder-decoder framework, and our framework applies self-knowledge distillation in both stages. The teacher model guides the training of the student model so that the encoder extracts enhanced feature representations and the temporal decoder becomes more robust against the over-segmentation problem.

Results: We validate our proposed framework on the public dataset Cholec80. Our framework is embedded on top of four popular SOTA approaches and consistently improves their performance. Specifically, our best GRU model boosts performance by +3.33% accuracy and +3.95% F1-score over the same baseline model.

Conclusion: We embed a self-knowledge distillation framework in the surgical phase recognition training pipeline for the first time. Experimental results demonstrate that our simple yet powerful framework can improve the performance of existing phase recognition models. Moreover, our extensive experiments show that even with 75% of the training set we still achieve performance on par with the same baseline model trained on the full set.
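The teacher-student idea described under Methods can be illustrated with the generic temperature-softened distillation loss. This is a minimal sketch, not the authors' formulation: the abstract does not state the exact loss, how the teacher is derived from the student, or any hyperparameters. The function names, `temperature`, `alpha`, and the example logits are all illustrative assumptions, and plain Python lists stand in for network outputs.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax; a higher temperature gives a softer distribution."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(student_logits, label):
    """Supervised loss of the student against the annotated phase label."""
    return -math.log(softmax(student_logits)[label])


def kl_divergence(p, q):
    """KL(p || q) between two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Weighted sum of the hard-label loss and the teacher-matching loss.

    `temperature` and `alpha` are assumed values chosen for illustration.
    In self-knowledge distillation the teacher logits come from the same
    architecture as the student (for example an earlier snapshot); how the
    teacher is obtained is not specified in the abstract.
    """
    hard = cross_entropy(student_logits, label)
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # The T^2 factor keeps the soft term on a scale comparable to the hard term.
    soft = (temperature ** 2) * kl_divergence(p_teacher, p_student)
    return (1.0 - alpha) * hard + alpha * soft


if __name__ == "__main__":
    # Hypothetical per-frame logits over seven phase classes.
    student = [2.0, 0.5, 0.1, -1.0, 0.0, 0.3, -0.5]
    teacher = [2.5, 0.2, 0.0, -1.2, 0.1, 0.1, -0.4]
    print(distillation_loss(student, teacher, label=0))
```

When the student already matches the teacher, the soft term vanishes and only the weighted hard-label loss remains; any disagreement with the teacher adds a non-negative penalty, which is what gives the regularizing effect.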

