Standardized Methods and Recommendations for Green Federated Learning

Federated learning (FL) enables collaborative model training over privacy-sensitive, distributed data, but its environmental impact is difficult to compare across studies because measurement boundaries are inconsistent and reporting is heterogeneous. We present a practical carbon-accounting methodology that tracks FL CO2e with NVIDIA NVFlare and CodeCarbon, attributing emissions to explicit phases (initialization, per-round training, evaluation, and idle/coordination). To capture non-compute effects, we additionally estimate communication emissions from transmitted model-update sizes under a configurable network energy model. We validate the approach on two representative workloads: CIFAR-10 image classification and retinal optic disk segmentation. On CIFAR-10, controlled client-efficiency scenarios show that system-level slowdowns and coordination effects can contribute meaningfully to the carbon footprint under an otherwise fixed FL protocol, increasing total CO2e by 8.34x (medium efficiency) and 21.73x (low efficiency) relative to the high-efficiency baseline. In retinal segmentation, swapping GPU tiers (H100 vs. V100) yields a consistent 1.7x runtime gap (290 vs. 503 minutes) while producing non-uniform changes in total energy and CO2e across sites, underscoring the need for per-site and per-round reporting. Overall, our results support standardized carbon accounting as a prerequisite for reproducible "green" FL evaluation. Our code is available at https://github.com/Pediatric-Accelerated-Intelligence-Lab/carbon_footprint.
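The communication term described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class `NetworkEnergyModel`, the function `communication_co2e_kg`, the assumption that each client downloads and uploads one model-sized payload per round, and every numeric value below are hypothetical and chosen only to show the shape of the calculation (bytes transferred, then energy, then CO2e).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NetworkEnergyModel:
    """Hypothetical network energy model; both fields are configurable."""
    kwh_per_gb: float            # energy needed to move 1 GB over the network
    grid_kg_co2e_per_kwh: float  # carbon intensity of the electricity used


def communication_co2e_kg(update_bytes: int, num_clients: int,
                          num_rounds: int, model: NetworkEnergyModel) -> float:
    """Estimate communication CO2e (kg) from model-update sizes.

    Assumption: every round, each client downloads the global model and
    uploads one update of the same size, hence the factor of 2.
    """
    total_bytes = 2 * update_bytes * num_clients * num_rounds
    total_gb = total_bytes / 1e9  # decimal gigabytes
    energy_kwh = total_gb * model.kwh_per_gb
    return energy_kwh * model.grid_kg_co2e_per_kwh


# Illustrative placeholder values, not figures from the paper.
example_model = NetworkEnergyModel(kwh_per_gb=0.05, grid_kg_co2e_per_kwh=0.4)
example_kg = communication_co2e_kg(
    update_bytes=50_000_000, num_clients=4, num_rounds=10, model=example_model
)
print(f"Estimated communication CO2e: {example_kg:.3f} kg")
```

Keeping the energy-per-gigabyte and grid-intensity values as explicit parameters means the same transferred-bytes count can be re-evaluated under different network assumptions, which is the kind of configurability the abstract refers to.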

