We propose Enginuity, the first open, large-scale, multi-domain engineering diagram dataset with comprehensive structural annotations designed for automated diagram parsing. By capturing hierarchical component relationships, connections, and semantic elements across diverse engineering domains, the dataset would enable multimodal large language models to address critical downstream tasks, including structured diagram parsing, cross-modal information retrieval, and AI-assisted engineering simulation. Enginuity would be transformative for AI for Scientific Discovery: it would enable AI systems to comprehend and manipulate the visual-structural knowledge embedded in engineering diagrams. This would remove a fundamental barrier that currently prevents AI from fully participating in scientific workflows in which diagram interpretation, technical drawing analysis, and visual reasoning are essential for hypothesis generation, experimental design, and discovery.