Gait recognition is one of the most important remote identification technologies and is gradually gaining traction across research and industrial communities. However, existing gait recognition methods rely heavily on task-specific upstream models driven by supervised learning to provide explicit gait representations, which inevitably introduces expensive annotation costs and may cause cumulative errors. Breaking away from this trend, this work explores effective gait representations based on the all-purpose knowledge produced by task-agnostic Large Vision Models (LVMs) and proposes a simple yet efficient gait framework, termed BigGait. Specifically, the Gait Representation Extractor (GRE) in BigGait effectively transforms all-purpose knowledge into implicit gait features in an unsupervised manner, drawing on the design principles of established gait representation construction approaches. Experimental results on CCPG, CASIA-B* and SUSTech1K indicate that BigGait significantly outperforms previous methods on both within-domain and cross-domain tasks in most cases, offering a more practical paradigm for learning the next-generation gait representation. Finally, we discuss prospective challenges and promising directions in LVMs-based gait recognition, aiming to inspire future work on this emerging topic. The source code will be available at https://github.com/ShiqiYu/OpenGait.