Gait recognition is a biometric technology that identifies individuals by their walking patterns. However, previous methods struggle to extract identity features accurately because these features are often entangled with non-identity cues. To address this challenge, we propose CLTD, a causality-inspired discriminative feature learning module designed to eliminate the influence of confounders in triple domains, i.e., spatial, temporal, and spectral. Specifically, we use the Cross Pixel-wise Attention Generator (CPAG) to generate attention distributions for factual and counterfactual features in the spatial and temporal domains. We then introduce the Fourier Projection Head (FPH) to project spatial features into the spectral space, which preserves essential information while reducing computational cost. Additionally, we employ an optimization method with contrastive learning to enforce semantic consistency across sequences from the same subject. Our approach yields significant performance improvements on challenging datasets, demonstrating its effectiveness. Moreover, it can be seamlessly integrated into existing gait recognition methods.
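The abstract does not say how the Fourier Projection Head is implemented. As a rough illustration of why a spectral projection can preserve essential information at lower cost, the following is a minimal sketch under our own assumptions: it takes the 2D FFT of each channel of a spatial feature map and keeps only a small block of low-frequency coefficients. The function name `fourier_project`, the `keep` parameter, and the low-pass truncation itself are hypothetical and are not taken from CLTD.

```python
import numpy as np


def fourier_project(features: np.ndarray, keep: int = 4) -> np.ndarray:
    """Map spatial features of shape (C, H, W) to a compact spectral descriptor.

    Hypothetical sketch, not the CLTD implementation: for each channel,
    keep the magnitudes of the keep x keep block of lowest-frequency
    2D FFT coefficients.
    """
    if features.ndim != 3:
        raise ValueError("expected features of shape (C, H, W)")
    c, h, w = features.shape
    if not 1 <= keep <= min(h, w):
        raise ValueError("keep must be between 1 and min(H, W)")

    # 2D FFT over the spatial axes of every channel.
    spectrum = np.fft.fft2(features, axes=(-2, -1))
    # Move the zero-frequency (DC) term to index (H // 2, W // 2).
    centered = np.fft.fftshift(spectrum, axes=(-2, -1))

    # Crop a keep x keep window that contains the DC term.
    top = h // 2 - keep // 2
    left = w // 2 - keep // 2
    low = centered[:, top:top + keep, left:left + keep]

    # Use magnitudes so the descriptor is real-valued; flatten per channel.
    return np.abs(low).reshape(c, -1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feature_map = rng.standard_normal((8, 16, 11))  # (C, H, W), toy sizes
    descriptor = fourier_project(feature_map, keep=4)
    print(descriptor.shape)
```

In this sketch each channel shrinks from H x W spatial values to keep x keep spectral magnitudes, so any later layer operates on a much smaller input. Which frequencies to retain, and whether to discard phase, are design choices that the abstract leaves open.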