Prototyping complex computer-aided design (CAD) models in modern software can be very time-consuming, largely because of the lack of intelligent systems that can quickly generate simpler intermediate parts. We propose Text2CAD, the first AI framework for generating parametric CAD models from text, using designer-friendly instructions suited to all skill levels. Furthermore, we introduce a data annotation pipeline that uses Mistral and LLaVA-NeXT to generate text prompts based on natural language instructions for the DeepCAD dataset. The annotated dataset contains $\sim170$K models and $\sim660$K text annotations, ranging from abstract CAD descriptions (e.g., generate two concentric cylinders) to detailed specifications (e.g., draw two circles with center $(x,y)$ and radii $r_{1}$, $r_{2}$, and extrude along the normal by $d$...). Within the Text2CAD framework, we propose an end-to-end transformer-based auto-regressive network that generates parametric CAD models from input texts. We evaluate our model with a combination of metrics covering visual quality, parametric precision, and geometrical accuracy. Our proposed framework shows great potential in AI-aided design applications. Our source code and annotations will be publicly available.