This study presents an empirical investigation into the energy consumption of Discriminative and Generative AI models within real-world MLOps pipelines. For Discriminative models, we examine various architectures and hyperparameters during training and inference and identify energy-efficient practices. For Generative AI, we assess Large Language Models (LLMs), focusing primarily on energy consumption across different model sizes and varying service requests. Our study employs software-based power measurements, which makes it easy to replicate across diverse configurations, models, and datasets. We analyse multiple models and hardware setups to uncover correlations among various metrics and to identify key contributors to energy consumption. The results indicate that for Discriminative models, optimising architectures, hyperparameters, and hardware can significantly reduce energy consumption without sacrificing performance. For LLMs, energy efficiency depends on balancing model size, reasoning complexity, and request-handling capacity, as larger models do not necessarily consume more energy when utilisation remains low. This analysis provides practical guidelines for designing green and sustainable ML operations, emphasising reductions in energy consumption and carbon footprint while maintaining performance. This paper can serve as a benchmark for accurately estimating total energy use across different types of AI models.