Recent advances in cognitive computing, combined with deep learning techniques, have enabled the development of intelligent cognitive systems (ICS). Such systems are particularly useful for rail defect detection, where an ICS emulates human-like analysis of image data to identify defect patterns. Although Convolutional Neural Networks (CNNs) have been successful in visual defect classification, large datasets for rail defect detection remain scarce, because the accident events that produce defective parts, and therefore defect images, are infrequent. Researchers have addressed this data scarcity with rule-based and generative data augmentation models. Among these, Variational Autoencoder (VAE) models can generate realistic data without extensive baseline datasets for noise modeling. This study proposes a VAE-based synthetic image generation technique for rail defects that incorporates weight decay regularization and an image reconstruction loss to prevent overfitting. The method is applied to create a synthetic dataset for the Canadian Pacific Railway (CPR) from just 50 real samples across five classes: 500 synthetic samples are generated with a low reconstruction loss of 0.021. A Vision Transformer (ViT) model fine-tuned on this synthetic CPR dataset achieves accuracies of 98% to 99% in classifying the five defect classes. These results offer a promising approach to the data scarcity challenge in rail defect detection and show the potential for robust ICS development in this domain.
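The abstract names three ingredients of the training objective (an image reconstruction loss, the VAE latent regularizer, and weight decay) without giving their exact form. The following is a minimal sketch of how such an objective is commonly assembled, not the authors' implementation. It assumes a mean-squared-error reconstruction term, a diagonal Gaussian posterior with a standard normal prior, and weight decay written as an explicit L2 penalty. The function names and the coefficients `beta` and `weight_decay` are hypothetical.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps, with eps ~ N(0, 1) (reparameterization trick)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def reconstruction_loss(x, x_hat):
    """Mean squared error between flattened pixels (assumed form of the loss)."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def kl_divergence(mu, log_var):
    """KL(N(mu, sigma^2) || N(0, I)) for a diagonal Gaussian posterior."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))


def l2_penalty(weights):
    """Sum of squared weights; scaled by weight_decay in the objective."""
    return sum(w * w for w in weights)


def vae_objective(x, x_hat, mu, log_var, weights, beta=1.0, weight_decay=1e-4):
    """Reconstruction + beta * KL + weight_decay * L2 (hypothetical coefficients)."""
    return (reconstruction_loss(x, x_hat)
            + beta * kl_divergence(mu, log_var)
            + weight_decay * l2_penalty(weights))


if __name__ == "__main__":
    rng = random.Random(0)
    x = [0.2, 0.8, 0.5, 0.1]          # toy "image" with four pixels
    x_hat = [0.25, 0.7, 0.5, 0.15]    # toy reconstruction from a decoder
    mu, log_var = [0.1, -0.2], [-0.5, 0.3]
    z = reparameterize(mu, log_var, rng)
    loss = vae_objective(x, x_hat, mu, log_var, weights=[0.3, -0.4, 0.1])
    print("latent sample:", z)
    print("objective:", loss)
```

In practice, weight decay is usually passed to the optimizer instead of being added to the loss by hand; the explicit penalty is used here only to make the regularization term visible.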