Most personalised federated learning (FL) approaches assume that the raw data of all clients are defined in a common subspace, i.e., that all clients store their data according to the same schema. For real-world applications, this assumption is restrictive: clients that use their own systems to collect and store data may rely on heterogeneous data representations. We aim to fill this gap. To this end, we propose a general framework called FLIC that maps clients' data onto a common feature space via local embedding functions. The common feature space is learnt in a federated manner using Wasserstein barycenters, while the local embedding functions are trained on each client via distribution alignment. We integrate this distribution alignment mechanism into a federated learning approach and detail the resulting FLIC algorithm. We evaluate its performance on FL benchmarks involving heterogeneous input feature spaces. In addition, we provide theoretical insights supporting the relevance of our methodology.
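To make the two ingredients named above more concrete, the following is a toy sketch only, not the FLIC algorithm itself. It assumes one-dimensional features that are well approximated by Gaussians, for which the Wasserstein-2 barycenter has a closed form, and it replaces the trained local embedding functions with a closed-form affine map. The function names, the two clients and their data are all hypothetical, and the clients here differ only in scale and offset rather than in their feature spaces.

```python
import random
import statistics


def gaussian_barycenter_1d(means, stds, weights):
    """W2 barycenter of 1-D Gaussians: weighted average of means and of stds."""
    mean = sum(w * m for w, m in zip(weights, means))
    std = sum(w * s for w, s in zip(weights, stds))
    return mean, std


def w2_squared_1d(m1, s1, m2, s2):
    """Squared Wasserstein-2 distance between two 1-D Gaussians."""
    return (m1 - m2) ** 2 + (s1 - s2) ** 2


def fit_affine_embedding(samples, target_mean, target_std):
    """Affine map aligning the samples' mean and std with the target (assumption:
    a stand-in for a learnt embedding function)."""
    mu = statistics.fmean(samples)
    sd = statistics.pstdev(samples)
    scale = target_std / sd
    return lambda x: target_mean + scale * (x - mu)


random.seed(0)
# Two hypothetical clients recording the same quantity with different scale/offset.
clients = [
    [random.gauss(10.0, 2.0) for _ in range(1000)],
    [random.gauss(-3.0, 0.5) for _ in range(1000)],
]

# Each client shares only summary statistics with the server (toy assumption).
client_stats = [(statistics.fmean(c), statistics.pstdev(c)) for c in clients]

# "Server" step: the common reference distribution is the barycenter.
bar_mean, bar_std = gaussian_barycenter_1d(
    [m for m, _ in client_stats], [s for _, s in client_stats], [0.5, 0.5]
)

# "Client" step: each client aligns its own data with the common distribution.
embedded = []
for samples in clients:
    phi = fit_affine_embedding(samples, bar_mean, bar_std)
    embedded.append([phi(x) for x in samples])

# Distance of each client to the barycenter, before and after alignment.
before = [w2_squared_1d(m, s, bar_mean, bar_std) for m, s in client_stats]
after = [
    w2_squared_1d(statistics.fmean(e), statistics.pstdev(e), bar_mean, bar_std)
    for e in embedded
]
```

In this toy setting the alignment is exact by construction, so the values in `after` are zero up to floating-point error. With learnt, non-linear embeddings and higher-dimensional data, alignment would instead be an objective minimised during local training.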