To mitigate the challenges that partial occlusion poses for pedestrian detection methods based on human pose keypoints, we present a novel pedestrian pose keypoint completion method called separation and dimensionality reduction-based generative adversarial imputation networks (SDR-GAIN). First, we use OpenPose to estimate pedestrian poses in images. Then, for pedestrians whose keypoints are incomplete due to occlusion or other factors, we isolate the head and torso keypoints and perform dimensionality reduction to enhance features and further unify the feature distribution. Finally, we introduce two generative models based on the generative adversarial networks (GAN) framework, which incorporate Huber loss, a residual structure, and L1 regularization to generate the missing head and torso keypoints of partially occluded pedestrians, thereby completing the pose. Our experiments on the MS COCO and JAAD datasets demonstrate that SDR-GAIN outperforms the basic GAIN framework, the interpolation methods PCHIP and MAkima, and the machine learning methods k-NN and MissForest on the pose completion task. Furthermore, SDR-GAIN has a running time of approximately 0.4 ms, which makes it suitable for real-time use. It therefore holds significant practical value in autonomous driving, where fast system response is of paramount importance. Specifically, it captures human pose keypoints rapidly and precisely, which broadens the range of applications for keypoint-based pedestrian detection tasks, including but not limited to pedestrian behavior recognition and prediction.
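The separation step and the loss ingredients named above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give the keypoint grouping, the missing-value encoding, or the exact loss formulation. The index groups `HEAD_IDX` and `TORSO_IDX` (assuming the 18-keypoint COCO ordering output by OpenPose), the NaN encoding of missing keypoints, and the helper names `split_keypoints`, `huber`, `generator_reconstruction_loss`, and `impute` are all hypothetical. The dimensionality reduction step and the adversarial discriminator term are omitted.

```python
import numpy as np

# Hypothetical index groups, assuming the 18-keypoint COCO ordering of OpenPose.
HEAD_IDX = [0, 14, 15, 16, 17]   # nose, eyes, ears (assumed grouping)
TORSO_IDX = [1, 2, 5, 8, 11]     # neck, shoulders, hips (assumed grouping)


def split_keypoints(pose):
    """Split an (18, 2) pose into flattened head and torso vectors.

    Missing keypoints are assumed to be encoded as NaN. Returns a dict that
    maps each group name to (values, mask), where mask is 1.0 for observed
    coordinates and 0.0 for missing ones.
    """
    parts = {}
    for name, idx in (("head", HEAD_IDX), ("torso", TORSO_IDX)):
        vec = pose[idx].reshape(-1)
        mask = (~np.isnan(vec)).astype(float)
        parts[name] = (np.nan_to_num(vec, nan=0.0), mask)
    return parts


def huber(residual, delta=1.0):
    """Elementwise Huber loss: quadratic near zero, linear in the tails."""
    abs_r = np.abs(residual)
    return np.where(abs_r <= delta,
                    0.5 * residual ** 2,
                    delta * (abs_r - 0.5 * delta))


def generator_reconstruction_loss(x, x_hat, mask, weights, delta=1.0, l1=1e-3):
    """Huber loss on observed coordinates plus an L1 penalty on the weights.

    Applying the L1 penalty to the generator weights is an assumption.
    """
    n_observed = max(mask.sum(), 1.0)
    data_term = (mask * huber(x - x_hat, delta)).sum() / n_observed
    penalty = l1 * sum(np.abs(w).sum() for w in weights)
    return data_term + penalty


def impute(x, x_hat, mask):
    """GAIN-style completion: keep observed values, fill the rest from x_hat."""
    return mask * x + (1.0 - mask) * x_hat
```

In this sketch each group would be fed to its own generator, matching the two generative models mentioned in the abstract, and `impute` would combine the generator output with the observed coordinates to produce the completed pose.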