Supervised machine learning algorithms play a crucial role in optical quality control in industrial production. These approaches require representative datasets for effective model training. However, non-defective components are abundant in production while defective parts are rare, so the resulting datasets are highly imbalanced, which degrades model performance. Existing strategies for addressing this imbalance have limitations: specialized loss functions require careful hyperparameter tuning, and traditional data augmentation techniques alter only simple image features. Therefore, this work explores generative artificial intelligence (GenAI) as an alternative method for expanding limited datasets and improving supervised machine learning performance. Specifically, we investigate Stable Diffusion and CycleGAN as image generation models, focusing on the segmentation of combine harvester components in thermal images for subsequent defect detection. Our results show that dataset expansion with Stable Diffusion yields the largest improvement, raising segmentation performance by 4.6 % to a Mean Intersection over Union (Mean IoU) of 84.6 %.
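For readers unfamiliar with the reported metric, the following is a minimal sketch of how Mean IoU is commonly computed for a segmentation mask. The abstract does not state the exact evaluation protocol, so the function name `mean_iou`, the nested-list mask representation, and the choice to skip classes absent from both masks are all assumptions made for illustration.

```python
from typing import List, Sequence


def mean_iou(
    pred: Sequence[Sequence[int]],
    target: Sequence[Sequence[int]],
    num_classes: int,
) -> float:
    """Average the per-class IoU over classes that occur in either mask.

    Assumption: classes absent from both masks are skipped rather than
    counted as IoU 0 or 1; the evaluation behind the reported numbers
    may handle this case differently.
    """
    intersection: List[int] = [0] * num_classes
    union: List[int] = [0] * num_classes

    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == t:
                # Pixel counts toward both intersection and union of its class.
                intersection[p] += 1
                union[p] += 1
            else:
                # Pixel belongs to the union of the predicted and the true class.
                union[p] += 1
                union[t] += 1

    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    if not ious:
        raise ValueError("no class is present in either mask")
    return sum(ious) / len(ious)


# Toy 2x3 masks with three hypothetical component classes (0, 1, 2).
pred_mask = [[0, 0, 1], [1, 1, 2]]
true_mask = [[0, 1, 1], [1, 1, 2]]
score = mean_iou(pred_mask, true_mask, num_classes=3)
```

In this toy case the per-class IoUs are 1/2, 3/4, and 1, so `score` is 0.75. A production pipeline would typically accumulate intersections and unions over the whole test set with array operations instead of Python loops.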