In architectural interior design, miscommunication frequently arises because clients lack design knowledge while designers struggle to explain complex spatial relationships, leading to delayed timelines and financial losses. Recent advances in generative layout tools narrow this gap by automating 3D visualization. However, prevailing methodologies exhibit limitations: rule-based systems hard-code spatial constraints that restrict participatory engagement, while data-driven models rely on extensive training datasets. Recent large language models (LLMs) bridge this gap by enabling intuitive reasoning about spatial relationships through natural language. This research presents an LLM-based, multimodal, multi-agent framework that dynamically converts natural language descriptions and imagery into 3D designs. Specialized agents (Reference, Spatial, Interactive, Grader), operating via prompt guidelines, collaboratively address the core challenges: the agent system enables real-time user interaction for iterative spatial refinement, while Retrieval-Augmented Generation (RAG) reduces data dependency without requiring task-specific model training. The framework accurately interprets spatial intent and generates optimized 3D indoor designs, improving productivity and encouraging non-designer participation. Evaluations across diverse floor plans and user questionnaires demonstrate its effectiveness. An independent LLM evaluator consistently rated participatory layouts higher in user-intent alignment, aesthetic coherence, functionality, and circulation. Questionnaire results indicated 77% satisfaction and a clear preference over traditional design software. These findings suggest the framework enhances user-centric communication and fosters more inclusive, effective, and resilient design processes. Project page: https://rsigktyper.github.io/AICodesign/