DEtection TRansformer (DETR) and its variants (DETRs) have been successfully applied to crowded pedestrian detection, which achieved promising performance. However, we find that, in different degrees of crowded scenes, the number of DETRs' queries must be adjusted manually, otherwise, the performance would degrade to varying degrees. In this paper, we first analyze the two current query generation methods and summarize four guidelines for designing the adaptive query generation method. Then, we propose Rank-based Adaptive Query Generation (RAQG) to alleviate the problem. Specifically, we design a rank prediction head that can predict the rank of the lowest confidence positive training sample produced by the encoder. Based on the predicted rank, we design an adaptive selection method that can adaptively select coarse detection results produced by the encoder to generate queries. Moreover, to train the rank prediction head better, we propose Soft Gradient L1 Loss. The gradient of Soft Gradient L1 Loss is continuous, which can describe the relationship between the loss value and the updated value of model parameters granularly. Our method is simple and effective, which can be plugged into any DETRs to make it query-adaptive in theory. The experimental results on Crowdhuman dataset and Citypersons dataset show that our method can adaptively generate queries for DETRs and achieve competitive results. Especially, our method achieves state-of-the-art 39.4% MR on Crowdhuman dataset.
翻译:DEtection TRansformer(DETR)及其变体(DETRs)已成功应用于密集行人检测任务,取得了令人满意的性能。然而,我们发现,在不同拥挤程度的场景中,DETRs的查询数量必须手动调整,否则性能会不同程度地下降。本文首先分析了两种现有的查询生成方法,并总结了设计自适应查询生成方法的四条准则。随后,我们提出基于排序的自适应查询生成方法(Rank-based Adaptive Query Generation, RAQG)以缓解该问题。具体而言,我们设计了一个排序预测头,用于预测编码器产生的置信度最低的正训练样本的排序位置。基于预测的排序,我们设计了一种自适应选择方法,能够自适应地选取编码器产生的粗检测结果来生成查询。此外,为更好地训练排序预测头,我们提出了软梯度L1损失(Soft Gradient L1 Loss)。该损失的梯度具有连续性,可精细刻画损失值与模型参数更新量之间的关系。我们的方法简单有效,理论上可嵌入任意DETRs模型实现查询自适应。在Crowdhuman数据集和Citypersons数据集上的实验结果表明,该方法能够为DETRs自适应生成查询并取得具有竞争力的结果。特别地,本方法在Crowdhuman数据集上实现了39.4%的漏检率(MR),达到当前最优水平。