RGBT multispectral pedestrian detection has emerged as a promising solution for safety-critical applications that require day/night operation. However, the modality bias problem remains unsolved, because multispectral pedestrian detectors learn the statistical bias in the datasets. Specifically, multispectral pedestrian detection datasets consist mainly of ROTO (day) and RXTO (night) data, so the majority of pedestrian labels statistically co-occur with thermal features. As a result, multispectral pedestrian detectors generalize poorly to examples outside this statistical correlation, such as ROTX data. To address this problem, we propose a novel Causal Mode Multiplexer (CMM) framework that learns the causal relationships between multispectral inputs and predictions. Moreover, we construct a new dataset (ROTX-MP) to evaluate modality bias in multispectral pedestrian detection. ROTX-MP mainly comprises ROTX examples not present in previous datasets. Extensive experiments demonstrate that the proposed CMM framework generalizes well on existing datasets (KAIST, CVC-14, FLIR) and on the new ROTX-MP. We will release the new dataset to the public for future research.