Let's Roll: Synthetic Dataset Analysis for Pedestrian Detection Across Different Shutter Types

Computer vision (CV) pipelines are typically evaluated on datasets that have been processed by image signal processing (ISP) pipelines, even though an important research goal for resource-constrained applications is to skip as many ISP steps as possible. In particular, most CV datasets consist of global shutter (GS) images, whereas most cameras today use a rolling shutter (RS). This paper studies the impact of the shutter mechanism on machine learning (ML) object detection models, using a synthetic dataset that we generate with the simulation capabilities of Unreal Engine 5 (UE5). Specifically, we train and evaluate mainstream detection models on our synthetically generated paired GS and RS datasets to determine whether detection accuracy differs significantly between the two shutter modalities, especially when capturing low-speed objects (e.g., pedestrians). The results of this emulation framework indicate that the two modalities perform almost identically on coarse-grained detection (mean average precision (mAP) at IoU=0.5) but differ significantly on fine-grained measures of detection accuracy (mAP at IoU=0.5:0.95). This implies that ML pipelines might not need explicit RS correction for many object detection applications, but that mitigating RS effects in ISP-less ML pipelines that target fine-grained localization of objects may need additional research.
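To make the distinction between the two metrics concrete, the following minimal sketch shows how a small box displacement can still count as a correct detection at IoU=0.5 while failing the stricter thresholds in the 0.5:0.95 range. It assumes the COCO-style convention of ten thresholds from 0.50 to 0.95 in steps of 0.05. The box coordinates and the 6-pixel shift are hypothetical values chosen for illustration, not measurements from the paper, and the helper names `iou` and `thresholds_passed` are ours.

```python
def iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Assumed COCO-style thresholds: 0.50, 0.55, ..., 0.95.
IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def thresholds_passed(pred, gt):
    """Return the IoU thresholds at which pred would match gt."""
    value = iou(pred, gt)
    return [t for t in IOU_THRESHOLDS if value >= t]


# Hypothetical pedestrian box (40 x 100 px) and a prediction that is
# displaced horizontally by 6 px, e.g. by a skew-like distortion.
gt_box = (100.0, 100.0, 140.0, 200.0)
pred_box = (106.0, 100.0, 146.0, 200.0)

overlap = iou(pred_box, gt_box)
passed = thresholds_passed(pred_box, gt_box)
print(f"IoU = {overlap:.3f}")
print(f"matched at {len(passed)} of {len(IOU_THRESHOLDS)} thresholds: {passed}")
```

In this illustrative case the prediction is a match at IoU=0.5 but at only half of the thresholds in the 0.5:0.95 range, which is the kind of gap that leaves the coarse metric unchanged while lowering the fine-grained one.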

