Despite impressive results, deep learning-based technologies also raise severe privacy and environmental concerns, induced by a training procedure that is often conducted in data centers. In response, alternatives to centralized training such as Federated Learning (FL) have emerged. Perhaps unexpectedly, FL is starting to be deployed at a global scale by companies that must adhere to new legal demands and policies originating from governments and social groups advocating for privacy protection. \textit{However, the potential environmental impact of FL remains unclear and unexplored. This paper offers the first-ever systematic study of the carbon footprint of FL.} First, we propose a rigorous model to quantify the carbon footprint, which facilitates the investigation of the relationship between FL design and carbon emissions. Then, we compare the carbon footprint of FL to that of traditional centralized learning. Our findings show that, depending on the configuration, FL can emit up to two orders of magnitude more carbon than centralized machine learning. However, in certain settings, its emissions can be comparable to those of centralized learning due to the reduced energy consumption of embedded devices. We performed extensive FL experiments across different types of datasets, settings, and deep learning models. Finally, we highlight and connect the reported results to future challenges and trends in FL for reducing its environmental impact, including algorithmic efficiency, hardware capabilities, and stronger industry transparency.