The emergence of generative models has revolutionized the field of remote sensing (RS) image generation. Although existing methods can generate high-quality images, they rely mainly on text control conditions and therefore do not always produce images accurately and stably. In this paper, we propose CRS-Diff, a new generative framework specifically tailored for RS image generation, which leverages the inherent advantages of diffusion models while integrating more advanced control mechanisms. Specifically, CRS-Diff can simultaneously support text-condition, metadata-condition, and image-condition control inputs, enabling more precise control to refine the generation process. To effectively integrate multiple sources of control information, we introduce a new conditional control mechanism that achieves multi-scale feature fusion, thereby strengthening the guiding effect of the control conditions. To our knowledge, CRS-Diff is the first multi-condition controllable RS generative model. Experimental results under both single-condition and multiple-condition settings demonstrate, quantitatively and qualitatively, that CRS-Diff generates RS images better than previous methods. Additionally, CRS-Diff can serve as a data engine that generates high-quality training data for downstream tasks, e.g., road extraction. The code is available at https://github.com/Sonettoo/CRS-Diff.