To comply with new legal requirements and privacy-protection policies, more and more companies are deploying cross-silo Federated Learning at global scale, in which several clients (silos) collaboratively train a global model under the coordination of a central server. Instead of sharing and transmitting data, clients train models on their private local data and exchange model updates. However, the carbon emission impact of cross-silo Federated Learning remains poorly understood, owing to a lack of related work. In this study, we first analyze the sustainability of cross-silo Federated Learning across the whole AI product life cycle, rather than model training alone, and compare it with the centralized approach. We propose a more holistic quantitative method for estimating cost and CO2 emissions in real-world cross-silo Federated Learning settings. Second, we propose a novel data and application management system that uses cross-silo Federated Learning and analytics to make IT companies more sustainable and cost-effective.