Facade renovation offers a more sustainable alternative to full demolition, yet producing design proposals that preserve existing structures while expressing new design intent remains challenging. Current workflows typically require detailed as-built modelling before design can begin, which is time-consuming and labour-intensive and often involves repeated revisions. To address this, we propose a three-stage framework combining generative artificial intelligence (AI) and a vision-language model (VLM) that directly processes a rough structural sketch and textual descriptions to produce consistent renovation proposals. First, a fine-tuned VLM takes the input sketch and predicts bounding boxes specifying where modifications are needed and which components should be added. Next, a Stable Diffusion model generates detailed sketches of the new elements, which are merged with the original outline through a generative inpainting pipeline. Finally, ControlNet is employed to refine the result into a photorealistic image. Experiments on datasets and real industrial buildings indicate that the proposed framework can generate renovation proposals that preserve the original structure while improving facade detail quality. This approach bypasses the need for detailed as-built modelling, enabling architects to rapidly explore design alternatives, iterate on early-stage concepts, and communicate renovation intentions with greater clarity.
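The data flow of the first two stages can be illustrated with a minimal, dependency-free sketch. It is not the framework's implementation: `Edit`, `propose_edits`, `draw_component`, and `merge` are hypothetical names, the VLM and the diffusion model are replaced by toy placeholder rules, and the inpainting step is approximated by a plain union of strokes. The only property it aims to show is that new elements are confined to the predicted boxes while every stroke of the original outline is kept.

```python
from dataclasses import dataclass
from typing import List

Sketch = List[List[int]]  # binary line drawing: 1 = stroke, 0 = blank


@dataclass(frozen=True)
class Edit:
    """One proposed modification: a box (x1/y1 exclusive) and a component label."""
    x0: int
    y0: int
    x1: int
    y1: int
    component: str


def propose_edits(sketch: Sketch, prompt: str) -> List[Edit]:
    """Stage 1 placeholder for the fine-tuned VLM (toy rule, an assumption)."""
    h, w = len(sketch), len(sketch[0])
    # Toy rule: request one component in the central region of the facade.
    return [Edit(w // 4, h // 4, 3 * w // 4, 3 * h // 4, prompt)]


def draw_component(edit: Edit) -> Sketch:
    """Stage 2 placeholder for the diffusion model: draws a rectangular frame."""
    h, w = edit.y1 - edit.y0, edit.x1 - edit.x0
    return [
        [1 if r in (0, h - 1) or c in (0, w - 1) else 0 for c in range(w)]
        for r in range(h)
    ]


def merge(sketch: Sketch, edits: List[Edit]) -> Sketch:
    """Stand-in for inpainting: union the new strokes into the original outline."""
    out = [row[:] for row in sketch]  # copy so the input sketch is not mutated
    for e in edits:
        for r, row in enumerate(draw_component(e)):
            for c, value in enumerate(row):
                y, x = e.y0 + r, e.x0 + c
                out[y][x] = max(out[y][x], value)  # never erase an existing stroke
    return out


# An 8x8 facade outline: strokes on the border only.
outline = [
    [1 if r in (0, 7) or c in (0, 7) else 0 for c in range(8)]
    for r in range(8)
]
edits = propose_edits(outline, "window")
merged = merge(outline, edits)
```

In a real pipeline the third stage would pass `merged` to a sketch-conditioned image generator; that step is omitted here because it cannot be reduced to a few lines without inventing details the text does not give.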