The advent of data-driven technology solutions is accompanied by growing concern about data privacy. This is of particular importance for human-centered image recognition tasks, such as pedestrian detection, re-identification, and tracking. To highlight the importance of privacy issues and motivate future research, we introduce the Pedestrian Dataset De-Identification (PDI) task. PDI evaluates a given de-identification method on two axes: the degree of de-identification it achieves and the performance of downstream task models trained on the de-identified data. As a first baseline, we propose IncogniMOT, a two-stage full-body de-identification pipeline based on image synthesis via generative adversarial networks. The first stage replaces target pedestrians with synthetic identities. To improve downstream task performance, the second stage blends and adapts the synthetic image parts into the data. To demonstrate the effectiveness of IncogniMOT, we generate a fully de-identified version of the MOT17 pedestrian tracking dataset and analyze its use as training data for pedestrian re-identification, detection, and tracking models. Furthermore, we show that our data narrows the synthetic-to-real performance gap in a privacy-conscious manner.