Gait recognition is a biometric technology that identifies individuals by their walking patterns. Motivated by the strong results of multimodal fusion in gait recognition, we employ LiDAR-camera fusion to obtain robust gait representations. However, existing methods often overlook the intrinsic characteristics of each modality and lack fine-grained fusion and temporal modeling. In this paper, we introduce LiCAF, a novel modality-sensitive network for LiDAR-camera fusion that employs an asymmetric modeling strategy. Specifically, we propose Asymmetric Cross-modal Channel Attention (ACCA) to select valuable channel information across modalities and Interlaced Cross-modal Temporal Modeling (ICTM) for powerful temporal modeling. Our method achieves state-of-the-art performance on the SUSTech1K dataset (93.9% Rank-1 and 98.8% Rank-5 accuracy), demonstrating its effectiveness.
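For readers unfamiliar with the Rank-1 and Rank-5 figures quoted above, the following is a minimal sketch of how Rank-k accuracy is commonly computed in retrieval-style recognition: a probe counts as a hit if its true identity appears among the k gallery entries closest to it. This illustrates the general metric only. The function name `rank_k_accuracy`, the toy distance matrix, and the identity labels are hypothetical, and the sketch does not reproduce the SUSTech1K evaluation protocol, whose details are not given here.

```python
from typing import Sequence


def rank_k_accuracy(
    dist: Sequence[Sequence[float]],
    probe_ids: Sequence[int],
    gallery_ids: Sequence[int],
    k: int,
) -> float:
    """Fraction of probes whose true identity is among the k nearest gallery entries."""
    if not probe_ids:
        raise ValueError("at least one probe is required")
    hits = 0
    for row, true_id in zip(dist, probe_ids):
        # Gallery indices sorted from smallest to largest distance.
        order = sorted(range(len(gallery_ids)), key=lambda j: row[j])
        top_k_ids = {gallery_ids[j] for j in order[:k]}
        if true_id in top_k_ids:
            hits += 1
    return hits / len(probe_ids)


# Toy example with made-up distances: 3 probes against 4 gallery sequences.
gallery_ids = [0, 1, 2, 3]
probe_ids = [0, 1, 2]
dist = [
    [0.1, 0.5, 0.9, 0.7],  # probe 0: nearest gallery identity is 0 (correct)
    [0.4, 0.6, 0.2, 0.8],  # probe 1: nearest are 2, then 0, then 1
    [0.3, 0.9, 0.5, 0.7],  # probe 2: nearest are 0, then 2
]
rank1 = rank_k_accuracy(dist, probe_ids, gallery_ids, k=1)
rank2 = rank_k_accuracy(dist, probe_ids, gallery_ids, k=2)
print(f"Rank-1: {rank1:.3f}, Rank-2: {rank2:.3f}")
```

In this toy setup only one of the three probes is matched at rank 1, while a second is recovered once the two nearest gallery entries are considered, which is why Rank-5 is always at least as high as Rank-1.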