Suppression based on Quadratic Unconstrained Binary Optimization (QUBO) is known to outperform conventional Non-Maximum Suppression (NMS) in object detection, especially in crowded scenes, where NMS may suppress (partially) occluded true positives that have low confidence scores. Although existing QUBO formulations are less likely than NMS to miss occluded objects, they leave room for improvement because they rely naively on confidence scores and on pairwise scores derived from the spatial overlap between predictions. This study proposes new QUBO formulations that aim to distinguish whether an overlap between predictions arises from occlusion between objects or from redundant predictions, i.e., multiple predictions for a single object. The proposed formulations integrate two features into the pairwise score of the existing QUBO formulation: i) an appearance feature computed with an image similarity metric and ii) the product of confidence scores. These features follow from the hypotheses that redundant predictions share similar appearance and that (partially) occluded objects have low confidence scores, respectively. The proposed methods improve significantly on state-of-the-art QUBO-based suppression without a notable increase in runtime, achieving gains of up to 4.54 points in mAP and 9.89 points in mAR.
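To make the idea concrete, the following is a minimal sketch of QUBO-based suppression with an augmented pairwise score. It is an illustration under stated assumptions, not the formulation used in this study: the objective (sum of confidence scores minus pairwise penalties), the multiplicative way of combining overlap with the two features, the weights `w_app` and `w_conf`, the precomputed `similarity` matrix, and all function names are hypothetical. The problem is solved by exhaustive search, which is only practical for a handful of predictions.

```python
from itertools import product


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def pairwise_penalty(i, j, boxes, scores, similarity, w_app=1.0, w_conf=1.0):
    """Assumed penalty: overlap scaled by appearance similarity and score product."""
    return iou(boxes[i], boxes[j]) * (
        w_app * similarity[i][j] + w_conf * scores[i] * scores[j]
    )


def solve_qubo(boxes, scores, similarity):
    """Maximise sum_i s_i*x_i - sum_{i<j} p_ij*x_i*x_j over binary x by brute force."""
    n = len(boxes)
    penalties = {
        (i, j): pairwise_penalty(i, j, boxes, scores, similarity)
        for i in range(n)
        for j in range(i + 1, n)
    }
    best_value, best_x = float("-inf"), None
    for x in product((0, 1), repeat=n):
        value = sum(scores[i] * x[i] for i in range(n))
        value -= sum(p * x[i] * x[j] for (i, j), p in penalties.items())
        if value > best_value:
            best_value, best_x = value, x
    return [i for i in range(n) if best_x[i]]


# Toy scene: boxes 0 and 1 are redundant predictions of one object,
# box 2 is a partially occluded second object with a low score.
boxes = [(0, 0, 10, 10), (1, 0, 11, 10), (5, 0, 15, 10)]
scores = [0.9, 0.8, 0.4]
# Hypothetical appearance similarities in [0, 1]; in practice these would
# come from an image similarity metric applied to the predicted regions.
similarity = [
    [1.0, 0.95, 0.2],
    [0.95, 1.0, 0.2],
    [0.2, 0.2, 1.0],
]
kept = solve_qubo(boxes, scores, similarity)
print(kept)  # prints [0, 2]
```

In this toy scene the redundant pair (high overlap, similar appearance, high score product) receives a large penalty, so only one of the two survives, while the occluded prediction is penalized lightly and is kept despite its low confidence.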