Knowledge distillation is an effective method for model compression, but applying it to object detection remains challenging. Two factors account for its poor performance on detection tasks: the severe imbalance between foreground and background features, and the lack of sufficient feature representation for small objects. To address these issues, we propose a new distillation method, dual relation knowledge distillation (DRKD), which comprises pixel-wise relation distillation and instance-wise relation distillation. Pixel-wise relation distillation embeds pixel-wise features in a graph space and applies graph convolution to capture the global pixel relation. By distilling this global pixel relation, the student detector learns the relation between foreground and background features and avoids the difficulty that feature imbalance causes when features are distilled directly. We also find that instance-wise relations supply valuable knowledge beyond independent features for small objects. We therefore design instance-wise relation distillation, which computes the similarity between different instances to obtain a relation matrix. More importantly, a relation filter module is designed to highlight valuable instance relations. The proposed dual relation knowledge distillation is general and can be easily applied to both one-stage and two-stage detectors. Our method achieves state-of-the-art performance on COCO 2017, improving Faster R-CNN based on ResNet50 from 38.4% to 41.6% mAP and RetinaNet based on ResNet50 from 37.4% to 40.3% mAP.
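The instance-wise relation idea described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the similarity function or the loss, so cosine similarity and a mean squared error between relation matrices are assumptions made here for illustration. The function names `relation_matrix` and `relation_distill_loss` are hypothetical, and the relation filter module is omitted.

```python
import math


def relation_matrix(features):
    """Pairwise similarity between instance feature vectors.

    Cosine similarity is an assumption of this sketch; the abstract only
    says that instance similarities form a relation matrix.
    """
    norms = [math.sqrt(sum(x * x for x in f)) for f in features]
    n = len(features)
    rel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dot = sum(a * b for a, b in zip(features[i], features[j]))
            # Small epsilon guards against division by zero for null vectors.
            rel[i][j] = dot / (norms[i] * norms[j] + 1e-12)
    return rel


def relation_distill_loss(student_feats, teacher_feats):
    """Mean squared difference between student and teacher relation matrices.

    Both inputs must describe the same N instances in the same order.
    The MSE form is an assumption, not the loss stated by the authors.
    """
    rs = relation_matrix(student_feats)
    rt = relation_matrix(teacher_feats)
    n = len(rs)
    total = sum((rs[i][j] - rt[i][j]) ** 2 for i in range(n) for j in range(n))
    return total / (n * n)


if __name__ == "__main__":
    teacher = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 1.0, 0.0]]
    student = [[0.9, 0.1], [0.2, 0.8], [0.7, 0.6]]
    print(relation_distill_loss(student, teacher))
```

Because each relation matrix is N×N for N instances, the student and teacher feature vectors may have different dimensions in this sketch, which is one reason relation-based distillation can be convenient when the two detectors differ in channel width.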