UrbanGenAI: Reconstructing Urban Landscapes using Panoptic Segmentation and Diffusion Models

In contemporary design practice, the integration of computer vision and generative artificial intelligence (genAI) marks a shift towards more interactive and inclusive processes. These technologies open new possibilities for image analysis and generation that are particularly relevant to urban landscape reconstruction. This paper presents a novel workflow, implemented in a prototype application, that combines advanced image segmentation with diffusion models for a comprehensive approach to urban design. The methodology uses the OneFormer model for detailed image segmentation and the Stable Diffusion XL (SDXL) diffusion model, implemented through ControlNet, for generating images from textual descriptions. Validation results indicated that the prototype performed well in both object detection and text-to-image generation, as evidenced by high Intersection over Union (IoU) and CLIP scores across iterative evaluations for various categories of urban landscape features. Preliminary testing included using UrbanGenAI as an educational tool to enhance learning in design pedagogy and as a participatory instrument to facilitate community-driven urban planning. Early results suggested that UrbanGenAI not only advances the technical frontiers of urban landscape reconstruction but also provides significant pedagogical and participatory planning benefits. Ongoing development aims to further validate its effectiveness across broader contexts and to integrate additional features such as real-time feedback mechanisms and 3D modelling capabilities.

Keywords: generative AI; panoptic image segmentation; diffusion models; urban landscape design; design pedagogy; co-design
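For readers unfamiliar with the segmentation metric named above, the following is a minimal sketch of per-class Intersection over Union on a toy label grid. It is not the evaluation code used for UrbanGenAI; the function name `class_iou`, the class ids (0 = road, 1 = building, 2 = vegetation), and the grids themselves are hypothetical and chosen only for illustration.

```python
def class_iou(pred, truth, label):
    """IoU for one class label over two equally sized 2D label grids."""
    intersection = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            in_pred = p == label
            in_truth = t == label
            if in_pred and in_truth:
                intersection += 1
            if in_pred or in_truth:
                union += 1
    # A class absent from both grids has no defined IoU; NaN is one common convention.
    return intersection / union if union else float("nan")


# Hypothetical 3x3 label maps: 0 = road, 1 = building, 2 = vegetation.
truth = [
    [0, 0, 1],
    [0, 1, 1],
    [2, 2, 1],
]
pred = [
    [0, 0, 1],
    [0, 0, 1],
    [2, 1, 1],
]

# Per-class scores for the three hypothetical classes.
scores = {label: class_iou(pred, truth, label) for label in (0, 1, 2)}
```

In this toy case the road class overlaps on 3 of 4 pixels in the union, so its IoU is 0.75. A score of 1.0 would mean the predicted and reference regions coincide exactly.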

