Automatic feature recognition (AFR) is essential for transforming design knowledge into actionable manufacturing information. Traditional AFR methods, which rely on predefined geometric rules and large datasets, are often time-consuming and generalize poorly across manufacturing features. To address these challenges, this study investigates vision-language models (VLMs) for automating the recognition of a wide range of manufacturing features in CAD designs without extensive training datasets or predefined rules. Instead, prompt engineering techniques, such as multi-view query images, few-shot learning, sequential reasoning, and chain-of-thought reasoning, are applied to enable recognition. The approach is evaluated on a newly developed CAD dataset containing designs of varying complexity relevant to machining, additive manufacturing, sheet metal forming, molding, and casting. Five VLMs, three closed-source (GPT-4o, Claude-3.5-Sonnet, and Claude-3.0-Opus) and two open-source (LLaVA and MiniCPM), are evaluated on this dataset against ground-truth features labelled by experts. The evaluation metrics are feature quantity accuracy, feature name matching accuracy, hallucination rate, and mean absolute error (MAE). Results show that Claude-3.5-Sonnet achieves the highest feature quantity accuracy (74%) and name-matching accuracy (75%) with the lowest MAE (3.2), while GPT-4o records the lowest hallucination rate (8%). In contrast, the open-source models have higher hallucination rates (>30%) and lower accuracies (<40%). This study demonstrates the potential of VLMs to automate feature recognition in CAD designs across diverse manufacturing scenarios.
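The abstract names four evaluation metrics but does not define them. The sketch below shows one plausible way to compute them for a single CAD design, purely as an illustration. Every definition here is an assumption rather than the study's own formulation: the function name `score_design`, the representation of features as name-to-count dictionaries, exact string matching of feature names, and the choice to treat a missing feature as a predicted count of zero are all hypothetical.

```python
from typing import Dict


def score_design(truth: Dict[str, int], pred: Dict[str, int]) -> Dict[str, float]:
    """Score one design under ASSUMED metric definitions (not taken from the paper).

    truth: expert-labelled feature names mapped to their counts.
    pred:  feature names and counts reported by the model.
    """
    if not truth:
        raise ValueError("ground truth must contain at least one feature")

    n_truth = len(truth)

    # Assumed: share of ground-truth feature names the model also reported.
    name_matching_accuracy = sum(1 for name in truth if name in pred) / n_truth

    # Assumed: share of ground-truth features whose predicted count is exact.
    # A feature the model did not report counts as a prediction of 0.
    quantity_accuracy = (
        sum(1 for name, count in truth.items() if pred.get(name, 0) == count) / n_truth
    )

    # Assumed: share of predicted feature names absent from the ground truth.
    hallucination_rate = (
        sum(1 for name in pred if name not in truth) / len(pred) if pred else 0.0
    )

    # Assumed: mean absolute count error over the ground-truth features.
    mae = sum(abs(pred.get(name, 0) - count) for name, count in truth.items()) / n_truth

    return {
        "name_matching_accuracy": name_matching_accuracy,
        "quantity_accuracy": quantity_accuracy,
        "hallucination_rate": hallucination_rate,
        "mae": mae,
    }


# Made-up example data, not drawn from the study's dataset.
truth = {"hole": 4, "slot": 2, "fillet": 6, "pocket": 1}
pred = {"hole": 4, "slot": 3, "fillet": 6, "boss": 1}
scores = score_design(truth, pred)
```

In this made-up example the model misses the pocket, miscounts the slots, and reports a boss that is not in the ground truth, so each metric is affected by a different kind of error. A real evaluation would likely need fuzzy or synonym-aware name matching and averaging across designs, which this sketch leaves out.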