In recent years, large-scale auto-regressive models have made significant progress on tasks such as text and video generation. However, their environmental impact has been largely overlooked, and their carbon footprint has rarely been assessed or analyzed. To address this gap, we introduce OpenCarbonEval, a unified framework for predicting the carbon emissions of large-scale models across diverse modalities. It gives AI service providers and users a way to estimate emissions in advance and helps mitigate the environmental pressure associated with these models. Within OpenCarbonEval, we propose a dynamic throughput modeling approach that captures workload and hardware fluctuations during training, yielding more precise emission estimates. Our evaluation shows that OpenCarbonEval predicts training emissions more accurately than previous methods for both visual and language models, and that it can be applied seamlessly to tasks in different modalities. By promoting sustainable AI development and deployment, OpenCarbonEval can help reduce the environmental impact of large-scale models and contribute to a more environmentally responsible future for the AI community.
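The abstract does not state the estimation formula, so the following is only a rough illustration of why a time-varying throughput can change an emissions estimate. It is not OpenCarbonEval's method. It assumes the common accounting in which emissions equal energy consumed times grid carbon intensity, with energy taken as device power times training time times a data-center overhead factor (PUE). All function names, parameters, and numbers here are hypothetical.

```python
def training_time_seconds(total_flops, n_devices, throughput_at, step_s=60.0):
    """Estimate wall-clock training time under a time-varying throughput.

    throughput_at(t) is a hypothetical callable returning the achieved
    FLOP/s of one device at elapsed time t (seconds). The function steps
    forward in fixed increments until the compute budget is used up.
    """
    remaining = float(total_flops)
    t = 0.0
    while remaining > 0:
        rate = n_devices * throughput_at(t)  # cluster-wide FLOP/s at time t
        if rate <= 0:
            raise ValueError("throughput must be positive")
        needed = remaining / rate  # seconds to finish at the current rate
        if needed <= step_s:
            return t + needed
        remaining -= rate * step_s
        t += step_s
    return t


def emissions_kg(total_flops, n_devices, throughput_at,
                 device_power_w, pue, grid_kg_per_kwh, step_s=60.0):
    """Assumed accounting: energy (kWh) times grid intensity (kg CO2e/kWh)."""
    seconds = training_time_seconds(total_flops, n_devices, throughput_at, step_s)
    joules = n_devices * device_power_w * seconds
    energy_kwh = joules / 3.6e6 * pue  # 1 kWh = 3.6e6 J
    return energy_kwh * grid_kg_per_kwh


def constant(t):
    """Static assumption: throughput never changes."""
    return 1e12


def with_warmup(t):
    """Illustrative fluctuation: half throughput for the first 10 minutes."""
    return 0.5e12 if t < 600 else 1e12


static_kg = emissions_kg(3.6e15, 1, constant, 1000.0, 1.0, 0.5)
dynamic_kg = emissions_kg(3.6e15, 1, with_warmup, 1000.0, 1.0, 0.5)
```

In this toy setup the run with a slow warm-up phase takes longer to spend the same compute budget, so its estimate comes out higher than the constant-throughput one. That gap is the kind of error a static throughput assumption can introduce.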