This work examines how variations in machine learning training regimes and learning paradigms affect energy consumption. While increasing data availability and innovation in high-performance hardware fuel the training of sophisticated models, they also contribute to a fading awareness of energy consumption and carbon emissions. Therefore, the goal of this work is to raise awareness of the energy impact of general training parameters and processes, from learning rate and batch size to knowledge transfer. Multiple setups with different hyperparameter initializations are evaluated on two different hardware configurations to obtain meaningful results. Experiments on pretraining and multitask training are conducted on top of the baseline results to determine their potential for sustainable machine learning.