This paper presents a new approach for training two-stage object detection ensemble models, specifically Faster R-CNN models, to estimate uncertainty. We propose that training one Region Proposal Network (RPN)~\cite{https://doi.org/10.48550/arxiv.1506.01497} and multiple Fast R-CNN prediction heads is all you need to build a robust deep ensemble network for estimating uncertainty in object detection. We describe this approach and provide experiments showing that it is much faster than the naive method of fully training all $n$ models in an ensemble. We also evaluate the estimated uncertainty by measuring the ensemble model's Expected Calibration Error (ECE). Finally, we compare the performance of this model with that of Gaussian YOLOv3, a variant of YOLOv3 that models the uncertainty of its predicted bounding box coordinates. The source code is released at \url{https://github.com/Akola-Mbey-Denis/EfficientEnsemble}.
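As an illustration of the calibration metric mentioned above, the following is a minimal sketch of how ECE could be computed for an ensemble of prediction heads. It is not taken from the released source code. It assumes that the per-head confidence scores for a detection are combined by a simple mean, that a detection counts as correct when it matches a ground-truth box (the matching rule is left to the caller), and that equal-width confidence bins are used. The function names `mean_head_confidence` and `expected_calibration_error` are hypothetical.

```python
def mean_head_confidence(head_scores):
    """Average the confidence that each prediction head assigns to one detection.

    Assumption: the ensemble output is the plain mean over heads.
    """
    if not head_scores:
        raise ValueError("at least one head score is required")
    return sum(head_scores) / len(head_scores)


def expected_calibration_error(confidences, correct, n_bins=10):
    """Expected Calibration Error with equal-width confidence bins.

    confidences: confidence in [0, 1] for each detection.
    correct:     1 (or True) if the detection is correct, else 0 (or False).
    Returns sum over bins of (bin size / N) * |accuracy - mean confidence|.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        return 0.0

    conf_sum = [0.0] * n_bins   # sum of confidences per bin
    hit_sum = [0.0] * n_bins    # number of correct detections per bin
    count = [0] * n_bins        # number of detections per bin

    for c, hit in zip(confidences, correct):
        # Bin k covers [k / n_bins, (k + 1) / n_bins); 1.0 goes to the last bin.
        k = min(int(c * n_bins), n_bins - 1)
        conf_sum[k] += c
        hit_sum[k] += 1.0 if hit else 0.0
        count[k] += 1

    ece = 0.0
    for k in range(n_bins):
        if count[k] == 0:
            continue  # empty bins contribute nothing
        accuracy = hit_sum[k] / count[k]
        avg_conf = conf_sum[k] / count[k]
        ece += (count[k] / n) * abs(accuracy - avg_conf)
    return ece


if __name__ == "__main__":
    # Four detections, each scored by three hypothetical heads.
    per_head = [[0.9, 0.9, 0.9]] * 4
    confs = [mean_head_confidence(s) for s in per_head]
    # Three of the four detections are correct, so accuracy is 0.75
    # while the mean confidence is about 0.9.
    print(expected_calibration_error(confs, [1, 1, 1, 0]))
```

In this toy case the ensemble is overconfident: it reports a confidence of about 0.9 while only 75% of the detections are correct, giving an ECE of approximately 0.15. A perfectly calibrated model would give an ECE of 0.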