This paper proposes a Ladder Bottom-up Convolutional Bidirectional Variational Autoencoder (LCBVAE) architecture for the encoder and decoder, trained on image translation of dotted Arabic expiration dates: it reconstructs dotted Arabic expiration dates as filled-in expiration dates. We customized and adapted a Convolutional Recurrent Neural Network (CRNN) model to our requirements to improve its performance in this setting, then trained the custom CRNN on filled-in images of dates from 2019 to 2027 to extract the expiration dates and to assess the performance of LCBVAE on expiration date recognition. The LCBVAE+CRNN pipeline can then be integrated into automated sorting systems that extract expiry dates and sort products accordingly during the manufacturing stage. It can also replace manual entry of expiration dates at merchants, which is time-consuming and inefficient. Because dotted Arabic expiration date images are not readily available, we created an Arabic dot-matrix TrueType Font (TTF) to generate synthetic images. We trained the model on 60,000 images of unrealistic synthetic dates and tested it on 3,000 images of realistic synthetic dates from 2019 to 2027, represented as yyyy/mm/dd. Our study demonstrates the significance of the latent bottleneck layer: increasing its size up to 1024 improves generalization in downstream transfer learning tasks such as image translation. The proposed approach achieved an accuracy of 97% on image translation using the LCBVAE architecture, which can be generalized to other downstream learning tasks such as image translation and reconstruction.
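The split between unrealistic training dates and realistic test dates can be illustrated with a small label generator. This is only a sketch under stated assumptions, not the authors' procedure: the abstract does not define "unrealistic", so it is taken here to mean digit strings in yyyy/mm/dd shape that need not be valid calendar dates, while "realistic" is taken to mean valid dates from 2019 to 2027. The function names `realistic_date` and `unrealistic_date` are hypothetical, and rendering the labels with the dot-matrix font is left out.

```python
import random
from datetime import date, timedelta


def realistic_date(rng, first_year=2019, last_year=2027):
    """Return a valid calendar date as a yyyy/mm/dd string (assumed meaning of 'realistic')."""
    start = date(first_year, 1, 1)
    end = date(last_year, 12, 31)
    # Pick a uniformly random day in the inclusive range [start, end].
    offset = rng.randrange((end - start).days + 1)
    return (start + timedelta(days=offset)).strftime("%Y/%m/%d")


def unrealistic_date(rng):
    """Return random digits in yyyy/mm/dd shape; the result may not be a valid date
    (assumed meaning of 'unrealistic')."""
    digits = [str(rng.randrange(10)) for _ in range(8)]
    return "".join(digits[:4]) + "/" + "".join(digits[4:6]) + "/" + "".join(digits[6:])


rng = random.Random(0)  # fixed seed so the label sets are reproducible
train_labels = [unrealistic_date(rng) for _ in range(60000)]
test_labels = [realistic_date(rng) for _ in range(3000)]
```

Each label would then be drawn twice, once with the dot-matrix font and once in filled-in form, to give the input and target images for image translation.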