Aerial object detection is a challenging task, in part because large-scale data collection is difficult and certain classes follow a long-tail distribution. Synthetic data offers a promising solution, especially with recent advances in diffusion-based methods such as Stable Diffusion (SD). However, applying diffusion methods directly to aerial domains poses unique challenges: Stable Diffusion is optimized for rich ground-level semantics, which does not match the sparse nature of aerial objects, and extracting object coordinates after synthesis remains problematic. To address these challenges, we introduce a synthetic data augmentation framework tailored for aerial images. It comprises sparse-to-dense region of interest (ROI) extraction to bridge the semantic gap, fine-tuning of the diffusion model with low-rank adaptation (LoRA) to avoid exhaustive retraining, and a Copy-Paste method that composes synthesized objects with backgrounds. Together, these components provide a synthetic-data approach to aerial object detection.
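As an illustration of the Copy-Paste step, the following is a minimal sketch of how a synthesized object patch might be composed onto a background while recording its bounding box. It is not the framework's implementation: the function name `paste_object`, the binary-mask input, and the `(x_min, y_min, x_max, y_max)` box convention are all assumptions made for this example, and how the object mask is obtained is not specified here.

```python
import numpy as np


def paste_object(background, patch, mask, x, y):
    """Paste `patch` onto a copy of `background` with its top-left corner at (x, y).

    Hypothetical helper for illustration only.
    background: (H, W, 3) uint8 image.
    patch:      (h, w, 3) uint8 synthesized object crop.
    mask:       (h, w) array, 1 where the object is, 0 elsewhere.
    Returns the composited image and a box (x_min, y_min, x_max, y_max),
    with x_max and y_max exclusive.
    """
    h, w = patch.shape[:2]
    bg_h, bg_w = background.shape[:2]
    if x < 0 or y < 0 or x + w > bg_w or y + h > bg_h:
        raise ValueError("patch does not fit inside the background")

    ys, xs = np.nonzero(mask > 0)
    if ys.size == 0:
        raise ValueError("mask contains no object pixels")

    out = background.copy()
    region = out[y:y + h, x:x + w]

    # Blend object pixels over the background using the mask as alpha.
    alpha = mask.astype(np.float32)[..., None]
    blended = alpha * patch + (1.0 - alpha) * region
    out[y:y + h, x:x + w] = blended.astype(background.dtype)

    # The box is the tight extent of the mask, shifted to background coordinates.
    box = (
        x + int(xs.min()),
        y + int(ys.min()),
        x + int(xs.max()) + 1,
        y + int(ys.max()) + 1,
    )
    return out, box


if __name__ == "__main__":
    bg = np.zeros((8, 8, 3), dtype=np.uint8)
    obj = np.full((2, 3, 3), 200, dtype=np.uint8)
    obj_mask = np.ones((2, 3), dtype=np.uint8)
    image, bbox = paste_object(bg, obj, obj_mask, x=4, y=1)
    print(bbox)
```

Because the paste location is chosen by the caller, the object coordinates are known by construction, which is one way a compositing step can sidestep locating objects in a fully synthesized image.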