Machine learning has driven an exponential increase in computational demand, leading to massive data centers that consume significant amounts of energy and contribute to climate change. This makes sustainable data center control a priority. In this paper, we introduce SustainDC, a set of Python environments for benchmarking multi-agent reinforcement learning (MARL) algorithms for data centers (DCs). SustainDC supports custom DC configurations and tasks such as workload scheduling, cooling optimization, and auxiliary battery management, with multiple agents managing these operations while accounting for each other's effects. We evaluate various MARL algorithms on SustainDC, showing their performance across diverse DC designs, locations, weather conditions, grid carbon intensities, and workload requirements. Our results highlight significant opportunities to improve data center operations using MARL algorithms. Given the increasing use of DCs driven by AI, SustainDC provides a crucial platform for developing and benchmarking advanced algorithms essential for achieving sustainable computing and addressing other heterogeneous real-world challenges.
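To make the coupling between the three tasks concrete, the following is a minimal toy sketch and not the SustainDC API. The class `ToyDataCenter`, its fields, the one-hour time step, and the linear cooling overhead are all assumptions introduced purely for illustration. It shows how a scheduling decision changes the IT load the cooling agent must handle, and how the battery decision changes the grid draw that determines the carbon footprint.

```python
from dataclasses import dataclass


@dataclass
class ToyDataCenter:
    """Hypothetical one-hour-step model; NOT the SustainDC interface."""
    battery_capacity_kwh: float = 50.0
    battery_kwh: float = 0.0   # energy currently stored in the battery
    deferred_kw: float = 0.0   # workload postponed from earlier steps

    def step(self, incoming_kw, defer_frac, cooling_overhead,
             battery_kw, carbon_g_per_kwh):
        """Apply one action per agent and return grams of CO2 for the hour.

        defer_frac       -- scheduler action: fraction of pending load to postpone
        cooling_overhead -- cooling action, modeled (as an assumption) as
                            cooling power per unit of IT power
        battery_kw       -- battery action: positive charges, negative discharges
        """
        # Scheduler: decide how much of the pending workload runs now.
        pending_kw = incoming_kw + self.deferred_kw
        self.deferred_kw = pending_kw * defer_frac
        it_kw = pending_kw - self.deferred_kw

        # Cooling: its power depends on the IT load the scheduler let through.
        cooling_kw = it_kw * cooling_overhead
        total_load_kw = it_kw + cooling_kw

        # Battery: clamp the request to what is stored, what fits,
        # and what the current load can absorb.
        max_charge = self.battery_capacity_kwh - self.battery_kwh
        max_discharge = min(self.battery_kwh, total_load_kw)
        battery_kw = max(-max_discharge, min(battery_kw, max_charge))
        self.battery_kwh += battery_kw  # one-hour step, so kW equals kWh

        # Grid energy drawn this hour, weighted by grid carbon intensity.
        grid_kwh = total_load_kw + battery_kw
        return grid_kwh * carbon_g_per_kwh
```

In this toy model, deferring half of a 100 kW workload from an hour with a carbon intensity of 400 g/kWh to an hour with 200 g/kWh lowers total emissions, but only because the cooling load shifts along with it. This kind of cross-agent effect is what a MARL benchmark has to capture.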