In label-efficient learning on video data, both the distillation method and the structural design of the teacher-student architecture significantly affect knowledge distillation, yet the relationship between these two factors has been overlooked in previous research. To address this gap, we propose a new weakly supervised learning framework for knowledge distillation in video classification, designed to improve the efficiency and accuracy of the student model. Our approach uses substage-based learning: knowledge is distilled according to the combination of student substages and the correlation between corresponding substages. We also employ progressive cascade training to mitigate the accuracy loss caused by the large capacity gap between the teacher and the student. Additionally, we propose a pseudo-label optimization strategy to refine the initial data labels. To optimize the loss functions of the different distillation substages during training, we introduce a new loss based on feature distributions. Extensive experiments on both real and simulated datasets show that the proposed approach outperforms existing distillation methods on video classification tasks. The proposed substage-based distillation approach has the potential to inform future research on label-efficient learning for video data.