This paper aims to design a unified Computer-Aided Design (CAD) generation system that can easily generate CAD models based on the user's inputs in the form of textual descriptions, images, point clouds, or even a combination of them. Towards this goal, we introduce CAD-MLLM, the first system capable of generating parametric CAD models conditioned on multimodal input. Specifically, within the CAD-MLLM framework, we leverage the command sequences of CAD models and employ advanced large language models (LLMs) to align the feature space across these diverse multimodal data and the CAD models' vectorized representations. To facilitate model training, we design a comprehensive data construction and annotation pipeline that equips each CAD model with corresponding multimodal data. Our resulting dataset, named Omni-CAD, is the first multimodal CAD dataset that contains a textual description, multi-view images, a point cloud, and a command sequence for each CAD model. It contains approximately 450K instances and their CAD construction sequences. To thoroughly evaluate the quality of our generated CAD models, we go beyond current evaluation metrics that focus on reconstruction quality by introducing additional metrics that assess topology quality and surface enclosure extent. Extensive experimental results demonstrate that CAD-MLLM significantly outperforms existing conditional generative methods and remains highly robust to noise and missing points. The project page and more visualizations can be found at: https://cad-mllm.github.io/