Recent advances in deep learning for vision have led to its increasing integration into real-world applications that perform complex Computer Vision (CV) tasks. However, image acquisition conditions have a major impact on the performance of high-level image processing. Two possible remedies are to artificially augment the training databases or to design deep learning models that are robust to signal distortions. We opt here for the first, enriching the database with complex and realistic distortions that existing databases have so far ignored. To this end, we built a new versatile database derived from the well-known MS-COCO database, to which we applied local and global photo-realistic distortions. The new local distortions are generated from the scene context of each image, exploiting both the depth information of the objects in the scene and their semantics. This guarantees a high level of photo-realism and makes it possible to explore real scenarios that are ignored in conventional databases dedicated to various CV applications. Our database offers an efficient way to improve the robustness of models for various CV tasks, such as Object Detection (OD), scene segmentation, and distortion-type classification. The image database, scene classification index, and distortion generation code are publicly available.\footnote{\url{https://github.com/Aymanbegh/CD-COCO}}