Automatic optical inspection (AOI) plays a pivotal role in manufacturing, primarily using high-resolution imaging instruments to scan products. It detects anomalies by analyzing image textures and patterns, making it an essential tool for industrial manufacturing and quality control. Despite its importance, deploying models for AOI often faces challenges: limited sample sizes that hinder effective feature learning, variation among source domains, and sensitivity to changes in lighting and camera position during imaging. These factors collectively degrade the accuracy of model predictions. Traditional AOI also often fails to exploit the rich mechanism-parameter information from machines or embedded within images, including statistical parameters, which typically benefits AOI classification. To address this, we introduce an external modality-guided data mining framework, rooted primarily in optical character recognition (OCR), that extracts statistical features from images as a second modality to enhance performance; we term it OANet (Ocr-Aoi-Net). A key aspect of our approach is aligning the external-modality features, extracted by a single modality-aware model, with the image features encoded by a convolutional neural network. This synergy enables a more refined fusion of semantic representations from the two modalities. We further introduce feature refinement and a gating function in OANet to optimize how these features are combined, strengthening inference and decision-making. Experimental results show that our method considerably boosts the recall rate of the defect detection model and remains highly robust even in challenging scenarios.
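The gated fusion described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: all shapes, parameter names, and the exact gate form (a per-dimension sigmoid gate blending the two modality embeddings) are assumptions for exposition.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fusion(img_feat, ocr_feat, W_g, b_g):
    """Blend two modality features with a learned per-dimension gate.

    g = sigmoid(W_g @ [img; ocr] + b_g)
    fused = g * img + (1 - g) * ocr
    """
    z = np.concatenate([img_feat, ocr_feat])
    g = sigmoid(W_g @ z + b_g)
    return g * img_feat + (1.0 - g) * ocr_feat

d = 8                                   # shared feature dimension (assumed)
img_feat = rng.standard_normal(d)       # stand-in for the CNN image embedding
ocr_feat = rng.standard_normal(d)       # stand-in for the OCR statistical-feature embedding
W_g = rng.standard_normal((d, 2 * d)) * 0.1   # hypothetical gate weights
b_g = np.zeros(d)

fused = gated_fusion(img_feat, ocr_feat, W_g, b_g)
print(fused.shape)  # (8,)
```

Because the gate values lie in (0, 1), each fused dimension is a convex combination of the corresponding image and OCR feature values, so neither modality can be pushed outside the range the two inputs span.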