The standard approach to computer vision problems is to train deep convolutional neural network (CNN) models on large-scale image datasets that are representative of the target task. In many scenarios, however, it is difficult to obtain sufficient image data for the target task. Data augmentation is one way to mitigate this challenge. A common practice is to explicitly transform existing images in desired ways to create the volume and variability of training data needed for good generalization performance. When data for the target domain is not accessible, a viable workaround is to synthesize training data from scratch, i.e., synthetic data augmentation. This paper presents an extensive review of synthetic data augmentation techniques. It covers data synthesis approaches based on realistic 3D graphics modeling, neural style transfer (NST), differential neural rendering, and generative artificial intelligence (AI) techniques such as generative adversarial networks (GANs) and variational autoencoders (VAEs). For each of these classes of methods, we focus on the important data generation and augmentation techniques, the general scope of application and specific use cases, and existing limitations and possible workarounds. Additionally, we provide a summary of common synthetic datasets for training computer vision models, highlighting their main features, application domains, and supported tasks. Finally, we discuss the effectiveness of synthetic data augmentation methods. Since this is the first paper to explore synthetic data augmentation methods in great detail, we hope to equip readers with the necessary background information and in-depth knowledge of existing methods and their attendant issues.