Recent years have seen a surge in the popularity of commercial AI products based on generative, multi-purpose AI systems promising a unified approach to building machine learning (ML) models into technology. However, this ambition of "generality" comes at a steep cost to the environment, given the amount of energy these systems require and the amount of carbon that they emit. In this work, we present the first systematic comparison of the ongoing inference cost of various categories of ML systems, covering both task-specific models (i.e., finetuned models that carry out a single task) and "general-purpose" models (i.e., those trained for multiple tasks). We measure deployment cost as the amount of energy and carbon required to perform 1,000 inferences on a representative benchmark dataset using these models. We find that multi-purpose, generative architectures are orders of magnitude more expensive than task-specific systems for a variety of tasks, even when controlling for the number of model parameters. We conclude with a discussion of the current trend of deploying multi-purpose generative ML systems, and caution that their utility should be more intentionally weighed against increased costs in terms of energy and emissions. All the data from our study can be accessed via an interactive demo to carry out further exploration and analysis.
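The cost metric described above (energy and carbon per 1,000 inferences) can be illustrated with a minimal sketch. It assumes that total energy has already been measured over some number of inferences and that emissions are estimated by multiplying energy by a grid carbon intensity; the abstract does not specify how the conversion is done, so this is an assumption rather than the authors' method. The function name, parameter names, and numeric values are hypothetical placeholders, not figures from the study.

```python
def cost_per_1000_inferences(total_energy_kwh, n_inferences, grid_intensity_g_per_kwh):
    """Normalise measured energy to 1,000 inferences and estimate emissions.

    All names here are hypothetical; the energy-times-intensity conversion
    is an assumed estimation method, not one confirmed by the abstract.
    """
    if n_inferences <= 0:
        raise ValueError("n_inferences must be positive")
    # Scale the measured energy to a batch of exactly 1,000 inferences.
    energy_kwh = total_energy_kwh / n_inferences * 1000
    # Estimated emissions in grams of CO2-equivalent.
    emissions_g = energy_kwh * grid_intensity_g_per_kwh
    return energy_kwh, emissions_g


# Illustrative placeholder inputs only; not measurements from the study.
energy, emissions = cost_per_1000_inferences(
    total_energy_kwh=0.5, n_inferences=5000, grid_intensity_g_per_kwh=400.0
)
print(f"{energy:.3f} kWh and {emissions:.1f} g CO2eq per 1,000 inferences")
```

Normalising to a fixed count of 1,000 inferences makes models comparable even when each was measured over a different number of requests.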