Computer-Aided Design (CAD) delivers rapid, editable modeling for engineering and manufacturing. Recent AI progress makes full automation feasible for a range of CAD tasks. However, progress is bottlenecked by data: public corpora mostly contain sketch-extrude sequences and lack complex operations, multi-operation composition, and design intent, which hinders effective fine-tuning. Attempts to bypass this with frozen VLMs often yield simple or invalid programs because current foundation models have limited 3D grounding. We present CADEvolve, an evolution-based pipeline and dataset that starts from simple primitives and, via VLM-guided edits and validations, incrementally grows CAD programs toward industrial-grade complexity. The result is 8k complex parts expressed as executable CadQuery parametric generators. After multi-stage post-processing and augmentation, we obtain a unified dataset of 1.3M scripts paired with rendered geometry and exercising the full CadQuery operation set. A VLM fine-tuned on CADEvolve achieves state-of-the-art results on the Image2CAD task across the DeepCAD, Fusion 360, and MCB benchmarks.
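The propose-validate-keep loop described in the abstract can be illustrated with a toy sketch. This is not the CADEvolve implementation: the operation names, `propose_edit` (standing in for a VLM-guided edit), and `is_valid` (standing in for executing the script and checking the resulting geometry) are all hypothetical placeholders, and a program is modeled as a plain list of operation names rather than a CadQuery script.

```python
import random

# Hypothetical operation names, loosely modeled on common CAD operations.
PRIMITIVES = ["box", "cylinder", "sphere"]
EDIT_OPS = ["hole", "fillet", "chamfer", "shell", "cut", "union"]


def propose_edit(program, rng):
    """Placeholder for a VLM-guided edit: append one operation to the parent."""
    return program + [rng.choice(EDIT_OPS)]


def is_valid(program):
    """Placeholder for execution and geometry checks.

    As an arbitrary toy rule, reject programs that repeat the same
    operation twice in a row.
    """
    return all(a != b for a, b in zip(program, program[1:]))


def evolve(generations=50, seed=0):
    """Grow a pool of programs from simple primitives, keeping only valid children."""
    rng = random.Random(seed)
    population = [[p] for p in PRIMITIVES]  # start from simple primitives
    for _ in range(generations):
        parent = rng.choice(population)
        child = propose_edit(parent, rng)
        if is_valid(child):  # only validated programs join the pool
            population.append(child)
    return population


population = evolve()
most_complex = max(population, key=len)
print(len(population), most_complex)
```

Because children are sampled from an ever-growing pool that includes earlier children, program length tends to increase over generations, which mirrors the incremental growth in complexity that the abstract describes.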