Autonomous 3D indoor scene synthesis breaks down in non-convex rooms with tightly coupled spatial constraints. Data-driven generators lack topological priors for long-horizon planning, while iterative agents fragment semantics and become geometrically brittle. We present ZoneMaestro, a unified framework that shifts the paradigm from object-centric synthesis to Zone-Graph Orchestration. By internalizing a novel zone-based logic, ZoneMaestro translates high-level semantic intent into functional zones and topological constraints, enabling robust adaptation to diverse architectural forms. To support this, we construct Zone-Scene-10K, a large-scale dataset enriched with explicit Zone-Graph annotations. We further introduce an Alternating Alignment Strategy that cycles between reasoning internalization and Zone-Aware Group Relative Policy Optimization (Z-GRPO), effectively reconciling the tension between semantic richness and geometric validity without relying on external physics engines. To rigorously evaluate spatial intelligence beyond convex primitives, we formally define the task of Intricate Spatial Orchestration and release SCALE, a stress-test benchmark for irregular indoor scenarios with complex, dense spatial relations. Extensive experiments demonstrate that ZoneMaestro resolves the density-safety dichotomy, significantly outperforming state-of-the-art baselines in both structural coherence and intent adherence.
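The abstract names Zone-Aware Group Relative Policy Optimization (Z-GRPO) but does not spell out its reward or its zone-aware terms. As a rough illustration of only the "group relative" part, the sketch below shows the advantage normalization commonly used in standard GRPO: each sampled output is scored against the mean and spread of its own sampling group. The function name, the reward values, and the choice of population standard deviation are assumptions made for illustration, not details taken from ZoneMaestro.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its own sampling group.

    Standard GRPO-style advantage: (r_i - mean(r)) / (std(r) + eps).
    Using the population standard deviation is an assumption here;
    implementations differ on this detail.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical scalar rewards for four layouts sampled from one prompt.
rewards = [0.2, 0.9, 0.5, 0.4]
advantages = group_relative_advantages(rewards)
```

Layouts that score above their group's average receive a positive advantage and the rest a negative one, so no separately learned value baseline is needed. How Z-GRPO folds zone-level signals into the reward is not stated in the abstract and is left out here.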