Gait recognition is a biometric technology that has received extensive attention. Most existing gait recognition algorithms are unimodal, and the few multimodal algorithms perform multimodal fusion only once. None of these algorithms fully exploits the complementary advantages of the multiple modalities. In this paper, by considering the temporal and spatial characteristics of gait data, we propose a multi-stage feature fusion strategy (MSFFS), which performs multimodal fusion at different stages of the feature extraction process. We also propose an adaptive feature fusion module (AFFM) that considers the semantic association between silhouettes and skeletons: it fuses different silhouette areas with their more closely related skeleton joints. Because changes in visual appearance and the passage of time co-occur within a gait period, we propose a multiscale spatial-temporal feature extractor (MSSTFE) to learn spatial-temporal linkage features thoroughly. Specifically, MSSTFE extracts and aggregates spatial-temporal linkage information at different spatial scales. Combining the strategy and modules above, we propose a multi-stage adaptive feature fusion (MSAFF) neural network, which achieves state-of-the-art performance in many experiments on three datasets. In addition, MSAFF is equipped with feature dimensional pooling (FD Pooling), which can significantly reduce the dimension of the gait representations without reducing accuracy. https://github.com/ShinanZou/MSAFF
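The idea of fusing each silhouette area with its more closely related skeleton joints can be illustrated with a small attention-style sketch. This is not the AFFM from the paper: the abstract does not specify how the association is computed, so the scaled dot-product affinity, the residual addition, and the names `softmax` and `fuse_parts_with_joints` are all assumptions made for illustration. Both modalities are assumed to be already projected to a common feature dimension.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_parts_with_joints(part_feats, joint_feats):
    """Hypothetical fusion: add an affinity-weighted mix of joint features
    to every silhouette-part feature.

    part_feats:  P vectors (one per silhouette area), each of length D
    joint_feats: J vectors (one per skeleton joint), each of length D
    Returns (fused, weights): fused is P x D, weights is P x J.
    """
    dim = len(part_feats[0])
    scale = math.sqrt(dim)
    fused, weights = [], []
    for part in part_feats:
        # Assumed affinity: scaled dot product between this area and each joint.
        scores = [
            sum(p * j for p, j in zip(part, joint)) / scale
            for joint in joint_feats
        ]
        attn = softmax(scores)
        # Weighted sum of joint features; more related joints contribute more.
        mixed = [
            sum(a * joint[d] for a, joint in zip(attn, joint_feats))
            for d in range(dim)
        ]
        # Residual addition keeps the original silhouette information.
        fused.append([p + m for p, m in zip(part, mixed)])
        weights.append(attn)
    return fused, weights


# Toy input: 2 silhouette areas, 3 joints, feature dimension 2.
part_feats = [[1.0, 0.0], [0.0, 1.0]]
joint_feats = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
fused, weights = fuse_parts_with_joints(part_feats, joint_feats)
```

In this toy input the first area is aligned with the first joint and the second area with the second joint, so each area's largest weight falls on its matching joint. Under a multi-stage strategy, a fusion step of this kind would be applied at more than one depth of the feature extractor rather than once.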