Existing pose estimation models perform poorly on wheelchair users because this group is underrepresented in training data. We present a data synthesis pipeline that addresses this disparity in data collection and, in turn, improves pose estimation performance for wheelchair users. Our configurable pipeline generates synthetic data of wheelchair users from motion capture data and motion generation outputs simulated in the Unity game engine. We validated the pipeline through a human evaluation of perceived realism and diversity, and through an AI performance evaluation on a set of synthetic datasets from our pipeline that varied backgrounds, models, and postures. We found that our generated datasets were perceived as realistic by human evaluators, were more diverse than existing image datasets, and improved person detection and pose estimation performance when used to fine-tune existing pose estimation models. With the data synthesis techniques demonstrated here, we hope to create a foothold for future efforts to make AI more inclusive in a data-centric and human-centric manner. Finally, so that future work can build on ours, we open-source all code from this research and provide the fully configurable Unity environment used to generate our datasets. For any models we cannot share because of redistribution and licensing policies, we provide detailed instructions on how to source and replace them.