State-of-the-art decentralized learning algorithms mostly assume that the data are Independent and Identically Distributed (IID) across agents. In practical scenarios, however, the distributed datasets can have significantly heterogeneous distributions across the agents. In this work, we present a novel approach for decentralized learning on heterogeneous data, in which data-free knowledge distillation through a contrastive loss on cross-features is used to improve performance. The cross-features for a pair of neighboring agents are the features (i.e., last hidden layer activations) obtained from the data of one agent with respect to the model parameters of the other agent. We demonstrate the effectiveness of the proposed technique through an extensive set of experiments on various computer vision datasets (CIFAR-10, CIFAR-100, Fashion MNIST, and ImageNet), model architectures, and network topologies. Our experiments show that the proposed method achieves superior performance (a 0.2-4% improvement in test accuracy) compared to existing techniques for decentralized learning on heterogeneous data.
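
To make the definition of cross-features concrete, the following is a minimal sketch, not the method's actual implementation. It assumes a toy one-layer model whose hidden activations are `tanh(W x)`, and every name in it (`make_model`, `features`, `batch_features`, `mean_squared_distance`) is hypothetical. The abstract does not specify the form of the contrastive loss, so the squared distance computed at the end is only an illustrative stand-in for a term that compares local features with cross-features.

```python
import math
import random


def make_model(in_dim, hidden_dim, seed):
    """Hypothetical toy model: one weight matrix of shape (hidden_dim, in_dim)."""
    rng = random.Random(seed)
    return [[rng.uniform(-1, 1) for _ in range(in_dim)] for _ in range(hidden_dim)]


def features(weights, x):
    """Last-hidden-layer activations of the toy model: tanh(W x)."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in weights]


def batch_features(weights, batch):
    """Apply the toy model to every sample in a batch."""
    return [features(weights, x) for x in batch]


def mean_squared_distance(feats_a, feats_b):
    """Illustrative stand-in (an assumption): mean squared distance per sample."""
    total = sum(
        (u - v) ** 2
        for fa, fb in zip(feats_a, feats_b)
        for u, v in zip(fa, fb)
    )
    return total / len(feats_a)


# Two neighboring agents i and j, each with its own parameters.
model_i = make_model(in_dim=4, hidden_dim=3, seed=0)
model_j = make_model(in_dim=4, hidden_dim=3, seed=1)

# Agent i's private data stays on agent i; only parameters are exchanged.
data_rng = random.Random(42)
data_i = [[data_rng.gauss(0, 1) for _ in range(4)] for _ in range(5)]

# Local features: agent i's data through agent i's parameters.
local_feats = batch_features(model_i, data_i)
# Cross-features: agent i's data through neighbor j's parameters.
cross_feats = batch_features(model_j, data_i)

gap = mean_squared_distance(local_feats, cross_feats)
print(f"local vs. cross-feature gap: {gap:.4f}")
```

In this sketch agent i evaluates its neighbor's parameters on its own samples, so no raw data leaves the agent, which is consistent with the data-free setting described above.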