Large Language Models (LLMs) have demonstrated remarkable capabilities in various domains, including data augmentation and synthetic data generation. This work explores the use of LLMs to generate rich textual descriptions for motion sequences, covering both actions and walking patterns. We use the expressive power of LLMs to align motion representations with high-level linguistic cues in two distinct tasks: action recognition and retrieval of walking sequences based on appearance attributes. For action recognition, we employ LLMs to generate textual descriptions of the actions in the BABEL-60 dataset, which facilitates the alignment of motion sequences with linguistic representations. For gait analysis, we investigate the impact of appearance attributes on walking patterns by using LLMs to generate textual descriptions of motion sequences from the DenseGait dataset. These descriptions capture subtle variations in walking style influenced by factors such as clothing and footwear. Our approach demonstrates the potential of LLMs for augmenting structured motion attributes and aligning multi-modal representations, and it opens new avenues for LLM-based multi-modal alignment and data augmentation in motion analysis. We make the code publicly available at https://github.com/Radu1999/WalkAndText