The steady growth of artificial intelligence (AI) has accelerated in recent years, facilitated by the development of sophisticated models such as large language models and foundation models. Robust and reliable power infrastructure is fundamental to realising the full potential of AI. However, AI data centres are extremely power-hungry, which has put their power management in the spotlight, especially with respect to its impact on the environment and on sustainable development. In this work, we investigate the capabilities and limits of an innovative approach to the power management of AI data centres: making part of the input power as dynamic as the power used for data-computing functions. The performance of passive and active devices is quantified and compared in terms of computational gain, energy efficiency, reduction of capital expenditure, and management costs by analysing power trends from multiple data platforms worldwide. This strategy, which represents a paradigm shift in AI data centre power management, has the potential to strongly improve the sustainability of AI hyperscalers, improving their environmental, financial, and societal footprint.