We introduce CADFS, a data-centric framework that enables large vision-language models to generate complex CAD design histories. Existing generative CAD systems are restricted to sketch-extrude operations due to simplified representations and limited datasets. We address this by introducing a FeatureScript-based representation and constructing a dataset of 450k real-world CAD models spanning 15 modeling operations. We obtain the dataset via a new pipeline that reconstructs clean, executable FeatureScript programs and provides multimodal annotations. Fine-tuning a VLM on this representation yields state-of-the-art results in text-conditioned CAD generation and image-based reconstruction, producing more accurate, diverse, and feature-rich designs than prior frameworks. Ablations show that each individual component of our framework, i.e., the FeatureScript representation, the extended operation set, and representation-aligned textual descriptions, significantly improves performance. Our framework substantially broadens the complexity and realism achievable in generative CAD. The CADFS framework and the new dataset are available at https://voyleg.github.io/cadfs/.