The rapid advancement of Generative Artificial Intelligence (GenAI) across diverse sectors raises significant environmental concerns, notably the carbon emissions from its cloud and high-performance computing (HPC) infrastructure. This paper presents Sprout, a framework that addresses these concerns by reducing the carbon footprint of generative Large Language Model (LLM) inference services. Sprout introduces the concept of "generation directives" to guide the autoregressive generation process and thereby improve carbon efficiency. The method balances ecological sustainability against the demand for high-quality generation outcomes. Using a directive optimizer that strategically assigns generation directives to user prompts, together with an original offline quality evaluator, Sprout reduces carbon emissions by over 40% in real-world evaluations using the Llama2 LLM and global electricity grid data. This research marks a critical step toward aligning AI technology with sustainable practices and highlights the potential for mitigating environmental impacts in the rapidly expanding domain of generative artificial intelligence.