The costs of training frontier AI models have grown dramatically in recent years, but there is limited public data on the magnitude and growth of these expenses. This paper develops a detailed cost model to address this gap, estimating training costs using three approaches that account for hardware, energy, cloud rental, and staff expenses. The analysis reveals that the amortized cost to train the most compute-intensive models has grown precipitously, at a rate of 2.4x per year since 2016 (95% CI: 2.0x to 3.1x). For key frontier models, such as GPT-4 and Gemini, the most significant expenses are AI accelerator chips and staff costs, each amounting to tens of millions of dollars. Other notable costs include server components (15-22% of the total), cluster-level interconnect (9-13%), and energy consumption (2-6%). If the trend of growing development costs continues, the largest training runs will cost more than a billion dollars by 2027, meaning that only the most well-funded organizations will be able to finance frontier AI models.
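The billion-dollar projection follows from compounding the estimated 2.4x annual growth rate. A minimal sketch of this extrapolation is below; the $100M base cost in 2024 is an illustrative assumption for demonstration, not a figure from the paper.

```python
# Extrapolate frontier training-run costs under the paper's estimated
# growth rate of 2.4x per year (95% CI: 2.0x to 3.1x).
# ASSUMPTION: the 2024 base cost of $100M is a hypothetical starting
# point chosen for illustration only.
GROWTH_PER_YEAR = 2.4
BASE_YEAR, BASE_COST_USD = 2024, 100e6  # assumed, not from the paper

def projected_cost(year: int) -> float:
    """Amortized training cost projected forward from the base year."""
    return BASE_COST_USD * GROWTH_PER_YEAR ** (year - BASE_YEAR)

for year in range(2024, 2028):
    print(f"{year}: ${projected_cost(year) / 1e9:.2f}B")
# → 2024: $0.10B
#   2025: $0.24B
#   2026: $0.58B
#   2027: $1.38B
```

Under these assumptions, the projected cost crosses the billion-dollar mark in 2027, consistent with the trend described above; a higher or lower base cost simply shifts the crossing year.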