The carbon footprint of large language models (LLMs) is a significant concern. It spans emissions from training, inference, experimentation, and storage, and includes both operational and embodied carbon. An essential step is accurately estimating the carbon impact of an emerging LLM before it is trained, a process that relies heavily on GPUs. Existing studies have reported the carbon footprint of LLM training, but only one tool, mlco2, can predict the carbon footprint of a new neural network prior to physical training. However, mlco2 has several serious limitations: it cannot extend its estimation to dense or mixture-of-experts (MoE) LLMs, disregards critical architectural parameters, focuses solely on GPUs, and cannot model embodied carbon footprints. To address these gaps, we introduce \textit{\carb}, an end-to-end carbon footprint projection model designed for both dense and MoE LLMs. Compared to mlco2, \carb~significantly improves the accuracy of carbon footprint estimates for various LLMs. The source code is released at \url{https://github.com/SotaroKaneda/MLCarbon}.