FlexCAD: Unified and Versatile Controllable CAD Generation with Fine-tuned Large Language Models

Recently, there has been growing interest in creating computer-aided design (CAD) models based on user intent, a task known as controllable CAD generation. Existing work offers limited controllability and requires separate models for different types of control, which reduces efficiency and practicality. To achieve controllable generation across all CAD construction hierarchies, such as sketch-extrusion, extrusion, sketch, face, loop, and curve, we propose FlexCAD, a unified model obtained by fine-tuning large language models (LLMs). First, to make CAD models easier for LLMs to comprehend, we represent each CAD model as structured text, abstracting each hierarchy as a sequence of text tokens. Second, to address various controllable generation tasks in a unified model, we introduce a hierarchy-aware masking strategy. Specifically, during training, we replace a hierarchy-aware field in the CAD text with a mask token. This field, composed of a sequence of tokens, can be set flexibly to represent any of the hierarchies. We then ask the LLM to predict the masked field. During inference, the user intent is converted into a CAD text in which a mask token replaces the part the user wants to modify; this text is then fed into FlexCAD to generate new CAD models. Comprehensive experiments on a public dataset demonstrate the effectiveness of FlexCAD in both generation quality and controllability. Code will be available at https://github.com/microsoft/CADGeneration/FlexCAD.
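
The hierarchy-aware masking idea can be illustrated with a minimal Python sketch. This is not the released FlexCAD code: the text format (`line x y`, `<loop_end>`, `extrude ...`), the `[mask]` token, and the helper names `render` and `all_paths` are all hypothetical, chosen only to show how one field at any hierarchy level might be replaced by a single mask token to form a (masked text, target) training pair.

```python
import random

MASK = "[mask]"  # hypothetical mask token, not taken from the paper

# Toy model: a list of (sketch, extrusion) pairs.
# A sketch is a list of faces, a face a list of loops, a loop a list of curve strings.

def all_paths(model):
    """Enumerate every maskable field, identified by (level, indices...)."""
    paths = []
    for i, (sketch, _) in enumerate(model):
        paths += [("se", i), ("sketch", i), ("extrusion", i)]
        for j, face in enumerate(sketch):
            paths.append(("face", i, j))
            for k, loop in enumerate(face):
                paths.append(("loop", i, j, k))
                paths += [("curve", i, j, k, m) for m in range(len(loop))]
    return paths

def render(model, mask=None):
    """Serialize the model to text; if `mask` is a path, replace that field
    with MASK and return the hidden text as the prediction target."""
    target = []

    def field(path, text):
        if path == mask:
            target.append(text)
            return MASK
        return text

    parts = []
    for i, (sketch, extrusion) in enumerate(model):
        faces = []
        for j, face in enumerate(sketch):
            loops = []
            for k, loop in enumerate(face):
                curves = [field(("curve", i, j, k, m), c) for m, c in enumerate(loop)]
                loops.append(field(("loop", i, j, k), " ".join(curves) + " <loop_end>"))
            faces.append(field(("face", i, j), " ".join(loops) + " <face_end>"))
        sk = field(("sketch", i), " ".join(faces) + " <sketch_end>")
        ex = field(("extrusion", i), extrusion + " <extrusion_end>")
        parts.append(field(("se", i), sk + " " + ex))
    return " ".join(parts), (target[0] if target else None)

# One square loop, extruded once (toy values).
model = [([[["line 0 0", "line 4 0", "line 4 4", "line 0 4"]]], "extrude 0 2")]

# Training: sample a field at a random hierarchy level and hide it.
path = random.choice(all_paths(model))
masked_text, target = render(model, path)
print(path, "|", masked_text, "|", target)
```

At inference time the same `render` call could be used with a path chosen by the user (for example `("loop", 0, 0, 0)` to regenerate only that loop), and the model's prediction would be substituted back in place of the mask token.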

