The promise and proliferation of large-scale dynamic federated learning raises a prominent open question: when transmission efficiency and fast knowledge transfer are the prime objectives, is it more prudent to share data or models across nodes? This work investigates exactly that question. Specifically, we study the choice among exchanging raw data, synthetic data, and (partial) model updates between devices. We also examine in detail the implications of these strategies in the context of foundation models. Accordingly, we obtain key insights into optimal data and model exchange mechanisms across environments with different data distributions and dynamic device and network connections. Across the scenarios we considered, time-limited knowledge transfer efficiency can differ by up to 9.08\%, highlighting the importance of this work.