Purpose: Advances in surgical phase recognition are generally driven by training deeper networks. Rather than pursuing ever more complex solutions, we believe that current models can be exploited better. We propose a self-knowledge distillation framework that can be integrated into current state-of-the-art (SOTA) models without adding model complexity or requiring extra annotations. Methods: Knowledge distillation is a framework for network regularization in which knowledge is distilled from a teacher network to a student network. In self-knowledge distillation, the student model becomes the teacher, so that the network learns from itself. Most phase recognition models follow an encoder-decoder framework, and our framework applies self-knowledge distillation in both stages. The teacher model guides the training of the student model so that the encoder extracts enhanced feature representations and the temporal decoder becomes more robust to the over-segmentation problem. Results: We validate the proposed framework on the public dataset Cholec80. Embedded on top of four popular SOTA approaches, it consistently improves their performance. Specifically, our best GRU model improves accuracy by +3.33% and F1-score by +3.95% over the same baseline model. Conclusion: We embed a self-knowledge distillation framework in the surgical phase recognition training pipeline for the first time. Experimental results demonstrate that this simple yet powerful framework can improve the performance of existing phase recognition models. Moreover, our extensive experiments show that even with 75% of the training set we still achieve performance on par with the same baseline model trained on the full set.
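The distillation idea described under Methods can be illustrated with a minimal sketch of a per-frame training loss. This is not the authors' implementation: the abstract does not specify the loss, so the temperature-softened KL term, the weighting `alpha`, the `temperature` value, and all function names here are assumptions chosen for illustration. The teacher logits are assumed to come from the same network (for example an earlier snapshot of it), which is what makes the distillation "self" distillation.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Cross-entropy between the prediction and a hard phase label."""
    return -math.log(softmax(logits)[label])


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def self_distillation_loss(student_logits, teacher_logits, label,
                           temperature=2.0, alpha=0.5):
    """Hypothetical combined loss: hard-label term plus soft teacher term.

    alpha and temperature are illustrative values, not taken from the paper.
    """
    hard = cross_entropy(student_logits, label)
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # The T^2 factor is the usual rescaling of the softened KL term.
    soft = kl_divergence(p_teacher, p_student) * temperature ** 2
    return (1 - alpha) * hard + alpha * soft


# Example: one frame, three candidate phases, ground-truth phase 0.
loss = self_distillation_loss([2.0, 0.5, -1.0], [1.5, 1.0, -0.5], label=0)
print(f"combined loss: {loss:.4f}")
```

When the teacher and student logits coincide, the soft term vanishes and only the weighted hard-label term remains, so the teacher contributes a training signal only where its predictions differ from the student's.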