Synthetic Datasets for Autonomous Driving: A Survey

Autonomous driving techniques have flourished in recent years and demand huge amounts of high-quality data. However, real-world datasets struggle to keep pace with changing requirements because collecting and labeling them is expensive and time-consuming. As a result, more and more researchers are turning to synthetic datasets, which make it easy to generate rich and varied data that complements real-world data and improves algorithm performance. In this paper, we summarize the evolution of synthetic dataset generation methods and review existing synthetic datasets for autonomous driving research, organized into single-task and multi-task categories. We also discuss the role synthetic datasets play in testing autonomous driving algorithms, covering evaluation, gap testing, and positive effects, especially with regard to trustworthiness and safety. Finally, we discuss general trends and possible development directions. To the best of our knowledge, this is the first survey focusing on the application of synthetic datasets in autonomous driving. This survey also raises awareness of the problems of real-world deployment of autonomous driving technology and offers researchers a possible solution.



