RGB-Thermal (RGB-T) crowd counting integrates visible-spectrum and thermal infrared information to make crowd density estimation more robust in complex scenes. Existing studies generally improve counting accuracy through cross-modal feature fusion, but most rely on implicit fusion strategies: they neither model local spatial discrepancies explicitly nor characterize modality reliability at the level of individual positions, which limits both the accuracy and the interpretability of the fusion process. To address these issues, this paper proposes RACANet, a two-stage Reliability-Aware Crowd Anchor Network for RGB-T crowd counting. First, we introduce a lightweight cross-modal alignment pretraining stage that explicitly learns cross-modal semantic correspondences through crowd-prior supervision and local bidirectional soft matching. Second, building on the priors learned during pretraining, we introduce a Local Anchor Fusion Module (LAFM) in the main training stage. This module generates local semantic anchors by aggregating features from highly reliable regions, then uses a local attention mechanism to adaptively redistribute features at the pixel level. In addition, we propose a discrepancy-aware consistency constraint that dynamically coordinates the reliability of regions where the modal representations are consistent. Experiments on two widely used benchmark datasets, RGBT-CC and Drone-RGBT, show that RACANet outperforms existing methods. Anonymized code is available at https://anonymous.4open.science/r/RACANet-9985.
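To make the anchor-based fusion idea concrete, the following is a minimal sketch of reliability-weighted anchor pooling followed by local attention redistribution. It is not the authors' LAFM implementation. The function names `local_anchors` and `redistribute`, the non-overlapping window layout, the 3x3 neighbourhood of candidate anchors, and the scaled dot-product scoring are all assumptions made for illustration, and the reliability map is taken as a given input rather than derived from the pretraining stage.

```python
import math


def local_anchors(feat, rel, win):
    """Reliability-weighted mean of the features in each win x win window.

    feat: H x W x C nested lists; rel: H x W reliability scores in [0, 1].
    (Hypothetical layout: non-overlapping windows, one anchor per window.)
    """
    H, W, C = len(feat), len(feat[0]), len(feat[0][0])
    anchors = []
    for i0 in range(0, H, win):
        row = []
        for j0 in range(0, W, win):
            acc, total = [0.0] * C, 0.0
            for i in range(i0, min(i0 + win, H)):
                for j in range(j0, min(j0 + win, W)):
                    total += rel[i][j]
                    for c in range(C):
                        acc[c] += rel[i][j] * feat[i][j][c]
            # Small epsilon avoids division by zero in all-unreliable windows.
            row.append([a / (total + 1e-8) for a in acc])
        anchors.append(row)
    return anchors


def redistribute(feat, anchors, win):
    """Each pixel attends over the anchors of its own and adjacent windows.

    Scores are scaled dot products; the output at a pixel is the
    softmax-weighted sum of the candidate anchors (an assumed design).
    """
    H, W, C = len(feat), len(feat[0]), len(feat[0][0])
    A_h, A_w = len(anchors), len(anchors[0])
    out = []
    for i in range(H):
        out_row = []
        for j in range(W):
            ai, aj = i // win, j // win
            cand = [anchors[p][q]
                    for p in range(max(ai - 1, 0), min(ai + 2, A_h))
                    for q in range(max(aj - 1, 0), min(aj + 2, A_w))]
            scores = [sum(f * a for f, a in zip(feat[i][j], anc)) / math.sqrt(C)
                      for anc in cand]
            m = max(scores)  # subtract the max for a numerically stable softmax
            exps = [math.exp(s - m) for s in scores]
            z = sum(exps)
            out_row.append([sum(e / z * anc[c] for e, anc in zip(exps, cand))
                            for c in range(C)])
        out.append(out_row)
    return out
```

In this sketch, a window whose reliability mass sits on a single pixel produces an anchor equal to that pixel's feature, so unreliable positions in the window are rewritten from the reliable one. A learned version would typically replace the raw dot product with projected queries and keys and would predict the reliability map, neither of which is shown here.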