Object detection techniques based on convolutional neural networks produce accurate results but are often time-consuming. Knowledge distillation is a popular model-compression approach for speeding up models. In this paper, we propose a Semi-supervised Adaptive Distillation (SAD) framework to accelerate single-stage detectors while still improving overall accuracy. We introduce an Adaptive Distillation Loss (ADL) that enables the student model to mimic the teacher's logits adaptively, paying more attention to two types of hard samples: hard-to-learn samples, which the teacher model predicts with low certainty, and hard-to-mimic samples, for which there is a large gap between the teacher's and the student's predictions. We then show that the student model can be improved further in the semi-supervised setting with the help of ADL. Our experiments validate that, for distillation on unlabeled data, ADL achieves better performance than existing data distillation using both soft and hard targets. On the COCO database, SAD makes a student detector with a ResNet-50 backbone outperform its teacher with a ResNet-101 backbone, while the student has half of the teacher's computational complexity.
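The abstract describes ADL only in words, so the following is a minimal sketch of one way such a loss could be written, not the paper's definitive implementation. It assumes a per-class sigmoid output (common in single-stage detectors), measures the teacher-student gap with a binary KL divergence, and measures teacher uncertainty with binary entropy. The modulating form `(1 - exp(-(KL + beta * T)))**gamma` and the values of `beta` and `gamma` are illustrative assumptions, as are all function names.

```python
import math

EPS = 1e-7  # assumed clamp to keep log() finite


def _clamp(x):
    """Keep a probability strictly inside (0, 1)."""
    return min(max(x, EPS), 1.0 - EPS)


def binary_kl(q, p):
    """KL(q || p) for Bernoulli distributions: the hard-to-mimic signal."""
    q, p = _clamp(q), _clamp(p)
    return q * math.log(q / p) + (1.0 - q) * math.log((1.0 - q) / (1.0 - p))


def binary_entropy(q):
    """Entropy of the teacher's prediction: the hard-to-learn signal."""
    q = _clamp(q)
    return -(q * math.log(q) + (1.0 - q) * math.log(1.0 - q))


def adaptive_weight(q, p, beta=1.5, gamma=1.0):
    """Modulating factor in [0, 1); grows with the gap and with uncertainty.

    beta and gamma are hypothetical hyperparameters chosen for illustration.
    """
    difficulty = binary_kl(q, p) + beta * binary_entropy(q)
    return (1.0 - math.exp(-difficulty)) ** gamma


def adaptive_distillation_loss(q, p, beta=1.5, gamma=1.0):
    """Weighted KL between teacher probability q and student probability p."""
    return adaptive_weight(q, p, beta, gamma) * binary_kl(q, p)


if __name__ == "__main__":
    # Same teacher output, student far away vs. student close.
    print(adaptive_distillation_loss(0.9, 0.1))
    print(adaptive_distillation_loss(0.9, 0.8))
```

Under these assumptions, a sample where the student already matches a confident teacher contributes almost nothing, while samples with a large gap or an uncertain teacher receive a larger weight, which is the behavior the abstract attributes to ADL.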
|Publication status||Published - 2020|
|Event||30th British Machine Vision Conference, BMVC 2019 - Cardiff, United Kingdom|
|Duration||9 Sept 2019 → 12 Sept 2019|
ASJC Scopus subject areas
- Computer Vision and Pattern Recognition