Second-order optimizers can significantly accelerate large-scale training, yet their naive federated variants are often unstable or even diverge on non-IID data. We show that a key culprit is \emph{preconditioner drift}: client-side second-order training induces heterogeneous \emph{curvature-defined geometries} (i.e., preconditioner coordinate systems), and server-side model averaging then mixes updates computed under incompatible metrics, corrupting the global descent direction. To address this geometric mismatch, we propose \texttt{FedPAC}, a \emph{preconditioner alignment and correction} framework for reliable federated second-order optimization. \texttt{FedPAC} explicitly decouples parameter aggregation from geometry synchronization via: (i) \textbf{Alignment}, which aggregates local preconditioners into a global reference and warm-starts clients with the global preconditioner; and (ii) \textbf{Correction}, which steers local preconditioned updates with a globally preconditioned direction to suppress long-term drift. We provide drift-coupled non-convex convergence guarantees with linear speedup under partial participation. Empirically, \texttt{FedPAC} consistently improves stability and accuracy across vision and language tasks, achieving up to a $5.8\%$ absolute accuracy gain on CIFAR-100 with ViTs. Code is available at https://anonymous.4open.science/r/FedPAC-8B24.
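The two mechanisms in the abstract can be sketched in a few lines. The sketch below is illustrative only and is not the authors' implementation: it assumes Adam-style \emph{diagonal} preconditioners, a simple EMA curvature estimate, and a hypothetical mixing weight \texttt{alpha} for the correction step; the function name \texttt{fedpac\_round} and all hyperparameters are invented for exposition.

```python
import numpy as np

def fedpac_round(w_global, client_grads, P_global,
                 alpha=0.5, lr=0.1, local_steps=5):
    """One illustrative FedPAC-style round with diagonal preconditioners.

    Alignment: each client warm-starts from the server's aggregated
    preconditioner P_global instead of its own stale local estimate.
    Correction: each local preconditioned step is blended with the
    globally preconditioned direction (weight alpha) to limit drift.
    """
    new_ws, new_Ps = [], []
    for grad_fn in client_grads:
        w, P = w_global.copy(), P_global.copy()  # alignment: warm start
        for _ in range(local_steps):
            g = grad_fn(w)
            P = 0.9 * P + 0.1 * g**2             # local curvature (EMA of g^2)
            local_dir = g / (np.sqrt(P) + 1e-8)
            global_dir = g / (np.sqrt(P_global) + 1e-8)
            # correction: steer the local update with the global geometry
            w -= lr * ((1 - alpha) * local_dir + alpha * global_dir)
        new_ws.append(w)
        new_Ps.append(P)
    # server: average parameters AND aggregate preconditioners separately,
    # decoupling parameter aggregation from geometry synchronization
    return np.mean(new_ws, axis=0), np.mean(new_Ps, axis=0)
```

On two toy quadratic clients with different curvatures, iterating this round drives the global iterate toward the joint optimum while the aggregated preconditioner keeps both clients stepping in a shared metric.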