This paper introduces a generative model for multimodal control of text-to-image foundation models such as Stable Diffusion, tailored to engineering design synthesis. The model provides parametric, image, and text control modalities to improve design precision and diversity. First, it handles both partial and complete parametric inputs through a diffusion model that acts as a design-autocomplete co-pilot, paired with a parametric encoder that processes the resulting parameters. Second, it uses assembly graphs to systematically assemble input component images, which a component encoder then processes to capture essential visual features. Third, textual descriptions are integrated via CLIP encoding to give a comprehensive interpretation of design intent. These diverse inputs are combined through a multimodal fusion technique into a joint embedding, which serves as the input to a module inspired by ControlNet. This integration allows the model to apply robust multimodal control to foundation models, enabling the generation of complex and precise engineering designs. The approach broadens the capabilities of AI-driven design tools and demonstrates that precise control grounded in diverse data modalities can substantially enhance design generation.
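The fusion step described above can be sketched minimally: each modality (parametric vector, component-image features, CLIP text features) is projected into a shared space and combined into one joint embedding that would condition a ControlNet-style module. All dimensions, encoder stand-ins, and the sum-based fusion below are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

# Hypothetical embedding dimensions; the abstract does not specify them.
PARAM_DIM, IMG_DIM, TXT_DIM, JOINT_DIM = 16, 32, 24, 64

rng = np.random.default_rng(0)

# Stand-ins for the three modality encoders (parametric encoder,
# component-image encoder, CLIP text encoder): here just random
# linear projections into the shared joint space, for illustration.
W_param = rng.standard_normal((PARAM_DIM, JOINT_DIM))
W_img = rng.standard_normal((IMG_DIM, JOINT_DIM))
W_txt = rng.standard_normal((TXT_DIM, JOINT_DIM))

def fuse(param_vec, img_vec, txt_vec):
    """Project each modality embedding into the shared space and sum them.

    Summation is one simple fusion choice; the paper's actual fusion
    technique may differ (e.g. concatenation followed by an MLP, or
    cross-attention).
    """
    return param_vec @ W_param + img_vec @ W_img + txt_vec @ W_txt

# Dummy per-modality embeddings standing in for real encoder outputs.
joint = fuse(rng.standard_normal(PARAM_DIM),
             rng.standard_normal(IMG_DIM),
             rng.standard_normal(TXT_DIM))
print(joint.shape)  # (64,)
```

The joint embedding would then replace the single-modality conditioning signal that a standard ControlNet consumes, which is what lets one control branch serve all three input types.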