Modern AI practices all strive towards the same goal: better results. In deep learning, "results" usually refers to the accuracy achieved on a competitive problem set. In this paper, we adopt an idea from the emerging field of Green AI: we treat energy consumption as a metric of equal importance to accuracy, and we aim to eliminate unnecessary tasks and energy usage. We examine the training stage of the deep learning pipeline from a sustainability perspective by studying hyperparameter tuning strategies and model complexity, two factors that strongly affect the pipeline's overall energy consumption. First, we investigate the effectiveness of grid search, random search and Bayesian optimisation for hyperparameter tuning, and we find that Bayesian optimisation significantly dominates the other two strategies. Second, we analyse the architecture of convolutional neural networks in terms of the energy consumption of three prominent layer types: convolutional, linear and ReLU layers. The results show that convolutional layers are the most computationally expensive by a wide margin. Additionally, we observe diminishing returns in accuracy for more energy-hungry models. The overall energy consumption of training can be halved by reducing network complexity. In conclusion, we highlight innovative and promising energy-efficient practices for training deep learning models. To broaden the application of Green AI, we advocate a shift in the design of deep learning models that considers the trade-off between energy efficiency and accuracy.
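To make the tuning comparison concrete, the following is a minimal sketch of grid search and random search in which the number of trials serves as a rough proxy for energy, since each trial would correspond to one full training run. Everything in it is an illustrative assumption rather than the setup used in the paper: the objective `toy_validation_error`, the search ranges, and the function names are hypothetical. Bayesian optimisation is not shown, because it needs a surrogate model that is beyond a short standard-library example.

```python
import itertools
import math
import random


def toy_validation_error(learning_rate, batch_size):
    """Hypothetical stand-in for one training run (lower is better).

    Assumed optimum: learning_rate = 1e-2, batch_size = 64.
    """
    lr_term = (math.log10(learning_rate) + 2) ** 2
    bs_term = 0.1 * (math.log2(batch_size) - 6) ** 2
    return lr_term + bs_term


def grid_search(learning_rates, batch_sizes):
    """Evaluate every combination; cost grows multiplicatively with grid size."""
    best = None
    trials = 0
    for lr, bs in itertools.product(learning_rates, batch_sizes):
        error = toy_validation_error(lr, bs)
        trials += 1
        if best is None or error < best[0]:
            best = (error, lr, bs)
    return best, trials


def random_search(n_trials, seed=0):
    """Sample a fixed number of configurations; the budget is set up front."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        lr = 10 ** rng.uniform(-4, -1)   # log-uniform learning rate (assumed range)
        bs = 2 ** rng.randint(4, 8)      # batch size in {16, 32, 64, 128, 256}
        error = toy_validation_error(lr, bs)
        if best is None or error < best[0]:
            best = (error, lr, bs)
    return best, n_trials


if __name__ == "__main__":
    grid_best, grid_trials = grid_search(
        [1e-4, 1e-3, 1e-2, 1e-1], [16, 32, 64, 128, 256]
    )
    rand_best, rand_trials = random_search(n_trials=10)
    print(f"grid:   {grid_trials} trials, best error {grid_best[0]:.4f}")
    print(f"random: {rand_trials} trials, best error {rand_best[0]:.4f}")
```

In this sketch the grid always costs 20 trials, while the random search budget is chosen by the caller, which is the property that makes trial count a convenient first-order handle on tuning energy. Measuring real energy would require hardware power readings, which this example does not attempt.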