Training a deep neural network (DNN) is time-consuming and costly. Although researchers have made considerable progress in reducing these costs, resource constraints mean that further work is still needed. This study examines approaches to speeding up DNN training, focusing on three state-of-the-art models: ResNet50, Vision Transformer (ViT), and EfficientNet. It applies three techniques, Gradient Accumulation (GA), Automatic Mixed Precision (AMP), and Pin Memory (PM), to optimize performance and accelerate training, and it assesses their effect on each model in terms of training speed and computational efficiency. Including GA yields a noteworthy decrease in training time, enabling the models to converge faster. AMP speeds up computation by exploiting lower-precision arithmetic while maintaining the correctness of the model. The study also investigates Pin Memory as a way to make data transfer between the central processing unit (CPU) and the graphics processing unit (GPU) more efficient, which offers a promising opportunity to improve overall performance. The experimental findings demonstrate that combining these techniques significantly accelerates DNN training, offering useful insights for practitioners seeking to improve the efficiency of deep learning workflows.