From Coordinate Matching to Structural Alignment: Rethinking Prototype Alignment in Heterogeneous Federated Learning

Heterogeneous federated learning (HtFL) aims to enable collaboration among clients that differ in both data distribution and model architecture. Prototype-based methods, which communicate class-level feature centers (prototypes) instead of full model parameters, have recently shown strong potential for HtFL. When aligning client-specific representations with global prototypes, existing prototype-based HtFL methods typically reuse the MSE-based or cosine-based alignment mechanisms developed for homogeneous FL. These mechanisms amount to coordinate alignment: client representations are forced to match the global prototypes element-wise in the embedding space. Such alignment implicitly assumes that all clients should map their representations into the feature subspace defined by the global prototypes. This assumption is reasonable in homogeneous FL, where all clients share the same feature extractor. It becomes problematic in HtFL, however, because heterogeneous feature extractors naturally induce client-specific feature subspaces, and forcing all clients to optimize within a single global subspace unnecessarily suppresses their learning capacity. We observe that coordinate alignment implicitly couples two distinct objectives: aligning inter-class semantic structure, which directly benefits classification, and enforcing a shared feature basis, which is unnecessary and even harmful under model heterogeneity. Building on this insight, we design FedSAF, which shifts the alignment objective from absolute coordinates to inter-class relational structure. We demonstrate that structural alignment consistently outperforms coordinate alignment in heterogeneous settings. Experiments on multiple benchmarks show that our structural alignment outperforms state-of-the-art prototype-based HtFL methods by up to 3.52%.
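The distinction between coordinate and structural alignment can be illustrated with a small numerical sketch. The code below is not the FedSAF objective, whose exact form the abstract does not specify; it is a hypothetical stand-in that assumes "inter-class relational structure" can be approximated by the matrix of pairwise cosine similarities between class prototypes. The function names (`coordinate_loss`, `structural_loss`) and the toy setup, in which a client's prototypes are the global prototypes expressed in a rotated feature basis, are assumptions made for illustration only.

```python
import numpy as np


def coordinate_loss(local_protos, global_protos):
    """Element-wise MSE between matching prototypes (coordinate alignment)."""
    return float(np.mean((local_protos - global_protos) ** 2))


def cosine_similarity_matrix(protos):
    """Pairwise cosine similarities between class prototypes, shape (C, C)."""
    normed = protos / np.linalg.norm(protos, axis=1, keepdims=True)
    return normed @ normed.T


def structural_loss(local_protos, global_protos):
    """MSE between inter-class similarity matrices (illustrative structural alignment).

    Assumption: relational structure is summarized by pairwise cosine similarity.
    """
    diff = cosine_similarity_matrix(local_protos) - cosine_similarity_matrix(global_protos)
    return float(np.mean(diff ** 2))


rng = np.random.default_rng(0)
num_classes, dim = 5, 16

# Toy global prototypes: one row per class.
global_protos = rng.normal(size=(num_classes, dim))

# A random orthogonal matrix stands in for a client-specific feature basis.
rotation, _ = np.linalg.qr(rng.normal(size=(dim, dim)))

# Same inter-class geometry as the global prototypes, different coordinates.
local_protos = global_protos @ rotation

coord = coordinate_loss(local_protos, global_protos)
struct = structural_loss(local_protos, global_protos)

print(f"coordinate loss: {coord:.4f}")
print(f"structural loss: {struct:.2e}")
```

In this toy setting the client already has the same inter-class geometry as the global prototypes, so the structural loss is zero up to floating-point error, while the coordinate loss stays large and would push the client to abandon its own basis. Because the structural loss compares only the class-by-class similarity matrices, this particular sketch would also apply when client and global prototypes have different feature dimensions.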
