Gestures are a key component of non-verbal communication in traffic, often supporting pedestrian-to-driver interaction where formal traffic rules are insufficient. This communication channel breaks down when autonomous vehicles (AVs) struggle to interpret such gestures. In this study, we present a gesture classification framework using 2D pose estimation applied to real-world video sequences from the WIVW dataset. We categorise gestures into four primary classes (Stop, Go, Thank & Greet, and No Gesture) and extract 76 static and dynamic features from normalised keypoints. Our analysis demonstrates that hand position and movement velocity are especially discriminative in distinguishing between gesture classes, achieving a classification accuracy of 87%. These findings not only improve the perceptual capabilities of AV systems but also contribute to the broader understanding of pedestrian behaviour in traffic contexts.
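The feature pipeline described above (normalised keypoints plus static and dynamic descriptors) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the choice of reference joint, the scale proxy, and the exact 76 features are assumptions for demonstration only.

```python
import numpy as np

def normalise_keypoints(kpts, ref_idx=0, scale_pair=(0, 1)):
    """Centre 2D keypoints on a reference joint and scale by a body-length proxy.

    kpts: (T, K, 2) array of keypoints over T frames.
    ref_idx and scale_pair (e.g. neck, neck-to-hip) are hypothetical choices.
    """
    centred = kpts - kpts[:, ref_idx:ref_idx + 1, :]
    # Per-frame scale: distance between two anchor joints, shape (T, 1).
    scale = np.linalg.norm(
        kpts[:, scale_pair[0]] - kpts[:, scale_pair[1]], axis=-1, keepdims=True
    )
    return centred / np.maximum(scale[:, None, :], 1e-6)

def static_and_dynamic_features(kpts):
    """Concatenate static features (normalised positions per frame) with
    dynamic features (frame-to-frame velocities, zero for the first frame)."""
    norm = normalise_keypoints(kpts)
    vel = np.diff(norm, axis=0, prepend=norm[:1])
    return np.concatenate([norm, vel], axis=-1)  # (T, K, 4)
```

A downstream classifier would then aggregate these per-frame features (e.g. means, maxima, and velocity statistics over a window) into a fixed-length vector per sequence.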