Interior design is a requirements-to-visual-plan generation process that must simultaneously satisfy verifiable spatial feasibility and comparative aesthetic preferences. While recent multimodal large language models (MLLMs) offer a unified foundation for interpreting user intent and producing design rationales, our empirical analysis reveals a persistent problem in real-world deployment: MLLMs often produce layouts that are unbuildable, aesthetically inconsistent, or both. These findings indicate that simply adding in-domain text is insufficient; effective interior design requires an alignment mechanism that separates hard constraints from soft preferences and coordinates them during optimization. To address this, we propose Design-MLLM, a reinforcement alignment framework that optimizes a feasibility-first preference objective via a dual-branch, aesthetic-oriented reward. Specifically, Design-MLLM (i) explicitly evaluates spatial feasibility using programmatic constraint checks, (ii) assesses aesthetic preference only among feasible candidates, so that visually appealing but unexecutable layouts cannot serve as reward shortcuts, and (iii) performs group-relative optimization to obtain stable preference signals. Through this process, Design-MLLM learns a controllable policy that consistently selects and generates solutions that are both executable and aesthetically coherent, rather than occasionally producing visually appealing but infeasible designs. Extensive experiments on various benchmark datasets demonstrate the advantages of Design-MLLM.
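The three components listed in the abstract (programmatic feasibility checks, aesthetic scoring restricted to feasible candidates, and group-relative optimization) can be illustrated with a minimal sketch. This is not the authors' implementation. The rectangle-based layout representation, the function names (`is_feasible`, `gated_reward`, `group_relative_advantages`), the fixed penalty for infeasible layouts, and the mean/standard-deviation normalization are all assumptions made for illustration; the aesthetic score is taken as a given number, standing in for whatever preference model the framework actually uses.

```python
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import List


@dataclass(frozen=True)
class Rect:
    """Axis-aligned furniture footprint in metres (hypothetical representation)."""
    x: float
    y: float
    w: float
    h: float


def overlaps(a: Rect, b: Rect) -> bool:
    """True if the two footprints share interior area (touching edges are allowed)."""
    return (a.x < b.x + b.w and b.x < a.x + a.w and
            a.y < b.y + b.h and b.y < a.y + a.h)


def is_feasible(layout: List[Rect], room_w: float, room_h: float) -> bool:
    """Hard constraints (assumed): every item lies inside the room, no two items collide."""
    inside = all(r.x >= 0 and r.y >= 0 and
                 r.x + r.w <= room_w and r.y + r.h <= room_h
                 for r in layout)
    no_collisions = all(not overlaps(a, b)
                        for i, a in enumerate(layout)
                        for b in layout[i + 1:])
    return inside and no_collisions


def gated_reward(layout: List[Rect], aesthetic_score: float,
                 room_w: float, room_h: float,
                 infeasible_penalty: float = -1.0) -> float:
    """Feasibility-first reward: the aesthetic score counts only for feasible layouts.

    The penalty value is an assumption; the source does not specify one.
    """
    if not is_feasible(layout, room_w, room_h):
        return infeasible_penalty
    return aesthetic_score


def group_relative_advantages(rewards: List[float], eps: float = 1e-8) -> List[float]:
    """Normalize rewards within one group of candidates sampled for the same prompt."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Three candidate layouts for a 4 m x 3 m room (made-up numbers).
ok_layout = [Rect(0, 0, 2, 1), Rect(2.5, 0, 1, 1)]
colliding_layout = [Rect(0, 0, 2, 1), Rect(1, 0, 2, 1)]
other_ok_layout = [Rect(0, 0, 1, 1), Rect(3, 2, 1, 1)]

rewards = [
    gated_reward(ok_layout, 0.6, 4.0, 3.0),
    gated_reward(colliding_layout, 0.9, 4.0, 3.0),  # high aesthetic score, but infeasible
    gated_reward(other_ok_layout, 0.8, 4.0, 3.0),
]
advantages = group_relative_advantages(rewards)
```

In this sketch the colliding layout receives the penalty despite having the highest aesthetic score, so its group-relative advantage is negative and a policy update would push probability away from it. Gating rather than summing the two signals is one simple way to keep hard constraints from being traded off against soft preferences.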