Cut-and-Paste with Precision: a Content and Perspective-aware Data Augmentation for Road Damage Detection

Damage to road pavement can develop into cracks, potholes, spalling, and other defects that pose significant challenges to the integrity, safety, and durability of the road structure. Detecting such damage and monitoring its evolution is crucial for maintaining the condition and structural health of road infrastructure. In recent years, researchers have explored various data-driven methods for image-based damage detection in road monitoring applications. The field gained attention with the introduction of the Road Damage Detection Challenge (RDDC2018), which encouraged competition in developing object detectors on street-view images from various countries. Leading teams have demonstrated the effectiveness of ensemble models, mostly based on the YOLO and Faster R-CNN series. Data augmentation has also proven beneficial for object detection in computer vision, through transformations such as random flipping, cropping, cutting out patches, and cutting and pasting object instances. Applying cut-and-paste augmentation to road damage therefore appears to be a promising way to increase data diversity. However, the standard cut-and-paste technique, which samples an object instance from a random image and pastes it at a random location in the target image, has demonstrated limited effectiveness for road damage detection. It overlooks the location of the road and disregards the difference in perspective between the sampled damage and the target image, resulting in unrealistic augmented images. In this work, we propose an improved cut-and-paste augmentation technique that is both content-aware (i.e., it considers the true location of the road in the image) and perspective-aware (i.e., it takes into account the difference in perspective between the injected damage and the target image).
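To make the contrast concrete, the following is a minimal sketch of both ideas, not the implementation used in this work. The abstract does not specify how the road location or the perspective difference is obtained, so everything here is an assumption for illustration: `road_mask` is a hypothetical boolean road segmentation, `horizon` is a hypothetical horizon row, and perspective is approximated by a single scale factor. That factor relies on the fact that, for a flat road seen by a pinhole camera, apparent size grows linearly with the number of rows below the horizon. The function names are likewise illustrative.

```python
import numpy as np


def naive_paste(target, patch, rng):
    """Standard cut-and-paste: uniform random location, no scene awareness."""
    H, W = target.shape[:2]
    h, w = patch.shape[:2]
    y = int(rng.integers(0, H - h + 1))
    x = int(rng.integers(0, W - w + 1))
    out = target.copy()
    out[y:y + h, x:x + w] = patch
    return out, (x, y, x + w, y + h)


def resize_nearest(patch, scale):
    """Nearest-neighbour resize, kept dependency-free for the sketch."""
    h, w = patch.shape[:2]
    nh = max(1, int(round(h * scale)))
    nw = max(1, int(round(w * scale)))
    rows = np.minimum((np.arange(nh) / scale).astype(int), h - 1)
    cols = np.minimum((np.arange(nw) / scale).astype(int), w - 1)
    return patch[rows][:, cols]


def aware_paste(target, road_mask, patch, src_bottom, horizon, rng, tries=200):
    """Paste only onto road pixels, rescaled for the new image row.

    ASSUMPTIONS (not from the paper): road_mask is a boolean road
    segmentation, horizon is the horizon row in both images, and
    src_bottom (> horizon) is the bottom row of the damage in its
    source image. Scaling alone is a crude stand-in for a full
    perspective correction.
    """
    ys, xs = np.nonzero(road_mask)
    keep = ys > horizon
    ys, xs = ys[keep], xs[keep]
    if len(ys) == 0:
        return target.copy(), None
    for _ in range(tries):
        i = int(rng.integers(len(ys)))
        yb, xc = int(ys[i]), int(xs[i])  # candidate bottom-centre pixel
        scale = (yb - horizon) / (src_bottom - horizon)
        p = resize_nearest(patch, scale)
        h, w = p.shape[:2]
        y0, x0 = yb - h + 1, xc - w // 2
        if y0 < 0 or x0 < 0 or x0 + w > target.shape[1]:
            continue  # box would leave the image
        if not road_mask[y0:yb + 1, x0:x0 + w].all():
            continue  # content-aware check: box must lie on the road
        out = target.copy()
        out[y0:yb + 1, x0:x0 + w] = p
        return out, (x0, y0, x0 + w, yb + 1)
    return target.copy(), None  # no valid placement found
```

Both functions return the augmented image and the pasted box as `(x_min, y_min, x_max, y_max)`, which can be appended to the target's annotations. `naive_paste` may place a pothole in the sky or on a building, whereas `aware_paste` rejects such placements and shrinks damage pasted closer to the horizon.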
