In label-efficient learning on video data, both the distillation method and the structural design of the teacher-student architecture strongly affect knowledge distillation. However, previous research has overlooked the relationship between these two factors. To address this gap, we propose a new weakly supervised learning framework for knowledge distillation in video classification, designed to improve the efficiency and accuracy of the student model. Our approach uses substage-based learning: knowledge is distilled based on the combination of student substages and on the correlation between corresponding substages. We also employ a progressive cascade training method to reduce the accuracy loss caused by the large capacity gap between the teacher and the student. Additionally, we propose a pseudo-label optimization strategy to refine the initial data labels. To optimize the loss functions of the different distillation substages during training, we introduce a new loss function based on feature distribution. Extensive experiments on both real and simulated datasets show that the proposed approach outperforms existing distillation methods on video classification tasks. The proposed substage-based distillation approach has the potential to inform future research on label-efficient learning for video data.
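To make the idea of per-substage distillation concrete, the following is a minimal sketch. It is not the proposed method: the abstract does not specify the form of the feature-distribution loss, so the sketch substitutes the standard temperature-softened soft-target loss (KL divergence between teacher and student distributions) and combines it across substages with a weighted sum. The function names, the temperature of 4.0, and the per-substage weights are all assumptions introduced for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """Generic soft-target loss: T^2 * KL(teacher || student).

    This is the standard soft-target formulation, used here as a stand-in;
    it is NOT the feature-distribution loss named in the abstract.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return temperature ** 2 * kl_divergence(p_teacher, p_student)


def substage_loss(student_stages, teacher_stages, weights):
    """Weighted sum of per-substage losses (the weighting is hypothetical)."""
    return sum(
        w * distillation_loss(s, t)
        for w, s, t in zip(weights, student_stages, teacher_stages)
    )


if __name__ == "__main__":
    # Two hypothetical substages, each producing 3-class logits.
    teacher = [[2.0, 1.0, 0.1], [3.0, 0.5, 0.2]]
    student = [[1.5, 1.2, 0.3], [2.0, 1.0, 0.5]]
    print(substage_loss(student, teacher, weights=[0.5, 0.5]))
```

Under these assumptions the loss is zero when the student reproduces the teacher's logits at every substage and grows as the two distributions diverge. A progressive cascade schedule could, for example, change the weights over the course of training, but that choice is likewise not specified in the abstract.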