Generative Model-Driven Synthetic Training Image Generation: An Approach to Cognition in Rail Defect Detection

Recent advances in cognitive computing, combined with deep learning techniques, have facilitated the development of intelligent cognitive systems (ICS). This is particularly beneficial for rail defect detection, where an ICS would emulate human-like analysis of image data to find defect patterns. Despite the success of Convolutional Neural Networks (CNNs) in visual defect classification, large datasets for rail defect detection remain scarce, because the accident events that produce defective parts, and images of them, are infrequent. Researchers have addressed this data scarcity by exploring rule-based and generative data augmentation models. Among these, Variational Autoencoder (VAE) models can generate realistic data without extensive baseline datasets for noise modeling. This study proposes a VAE-based synthetic image generation technique for rail defects that incorporates weight decay regularization and an image reconstruction loss to prevent overfitting. The proposed method is applied to create a synthetic dataset for the Canadian Pacific Railway (CPR) from just 50 real samples across five classes. From these, 500 synthetic samples are generated with a reconstruction loss of 0.021. A Vision Transformer (ViT) model fine-tuned on this synthetic CPR dataset achieves 98%-99% accuracy in classifying the five defect classes. This research offers a promising solution to the data scarcity challenge in rail defect detection and shows the potential for robust ICS development in this domain.
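
The abstract names two ingredients of the training objective, an image reconstruction loss and weight decay regularization, but does not give the exact formulation. The following is a minimal NumPy sketch of a generic VAE objective under stated assumptions: mean squared error as the reconstruction term, a standard-normal prior for the KL term, and weight decay written as an explicit L2 penalty. The function names, the `beta` weight, and the `weight_decay` value are illustrative and are not taken from the paper.

```python
import numpy as np


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, I) (reparameterization trick)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps


def vae_loss(x, x_recon, mu, log_var, weights, beta=1.0, weight_decay=1e-4):
    """Generic VAE objective: reconstruction + beta * KL + L2 weight penalty.

    Assumptions (not from the paper): MSE reconstruction, N(0, I) prior,
    and weight decay expressed as an explicit loss term.
    Returns (total, recon, kl, l2).
    """
    # Mean squared reconstruction error over every pixel in the batch.
    recon = np.mean((x - x_recon) ** 2)
    # KL(N(mu, diag(exp(log_var))) || N(0, I)), summed over latent
    # dimensions and averaged over the batch.
    kl = -0.5 * np.mean(np.sum(1.0 + log_var - mu ** 2 - np.exp(log_var), axis=1))
    # L2 penalty over all parameter arrays.
    l2 = weight_decay * sum(np.sum(w ** 2) for w in weights)
    return recon + beta * kl + l2, recon, kl, l2


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.random((8, 64))                 # 8 flattened toy "images"
    x_recon = x + 0.1 * rng.standard_normal(x.shape)
    mu = 0.1 * rng.standard_normal((8, 4))  # toy encoder outputs
    log_var = np.zeros((8, 4))
    z = reparameterize(mu, log_var, rng)    # latent samples fed to a decoder
    weights = [rng.standard_normal((64, 4))]
    total, recon, kl, l2 = vae_loss(x, x_recon, mu, log_var, weights)
    print(f"total={total:.4f} recon={recon:.4f} kl={kl:.4f} l2={l2:.4f}")
```

In practice the L2 term is often applied through the optimizer instead of the loss. Either way, the penalty discourages large weights, which matters when only about 50 real images are available.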