We present a novel theoretical analysis of Federated SARSA (FedSARSA) with linear function approximation and local training. We establish convergence guarantees for FedSARSA in the presence of heterogeneity, both in local transitions and rewards, providing the first sample and communication complexity bounds in this setting. At the core of our analysis is a new, exact multi-step error expansion for single-agent SARSA, which is of independent interest. Our analysis precisely quantifies the impact of heterogeneity, demonstrating the convergence of FedSARSA with multiple local updates. Crucially, we show that FedSARSA achieves linear speed-up with respect to the number of agents, up to higher-order terms due to Markovian sampling. Numerical experiments support our theoretical findings.
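To make the setting concrete, the following is a minimal sketch of the FedSARSA scheme the abstract describes: several agents run local SARSA(0) updates with linear function approximation on heterogeneous MDPs, and a server periodically averages their parameters. All names, dimensions, and the toy MDP construction here are illustrative assumptions, not the paper's actual experimental setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setup (illustrative only): N agents, small MDPs.
N_AGENTS = 4
N_STATES, N_ACTIONS, DIM = 5, 2, 3
GAMMA, ALPHA = 0.9, 0.05          # discount factor, local step size
LOCAL_STEPS, ROUNDS = 10, 50      # local updates per round, communication rounds

# Shared random features phi(s, a) in R^DIM for linear function approximation.
PHI = rng.standard_normal((N_STATES, N_ACTIONS, DIM)) / np.sqrt(DIM)

# Heterogeneity: each agent perturbs a common base transition kernel and reward.
base_P = rng.dirichlet(np.ones(N_STATES), size=(N_STATES, N_ACTIONS))
base_R = rng.standard_normal((N_STATES, N_ACTIONS))
agents = []
for _ in range(N_AGENTS):
    P = 0.9 * base_P + 0.1 * rng.dirichlet(np.ones(N_STATES),
                                           size=(N_STATES, N_ACTIONS))
    R = base_R + 0.1 * rng.standard_normal((N_STATES, N_ACTIONS))
    agents.append((P, R))

def eps_greedy(theta, s, eps=0.1):
    """Epsilon-greedy action w.r.t. the linear value estimate phi(s, .) @ theta."""
    if rng.random() < eps:
        return int(rng.integers(N_ACTIONS))
    return int(np.argmax(PHI[s] @ theta))

def local_sarsa(theta, P, R, s, a, steps):
    """Run `steps` on-policy SARSA(0) updates along one Markovian trajectory."""
    theta = theta.copy()
    for _ in range(steps):
        s2 = int(rng.choice(N_STATES, p=P[s, a]))   # Markovian sampling
        a2 = eps_greedy(theta, s2)
        td = R[s, a] + GAMMA * PHI[s2, a2] @ theta - PHI[s, a] @ theta
        theta += ALPHA * td * PHI[s, a]             # semi-gradient TD update
        s, a = s2, a2
    return theta, s, a

theta = np.zeros(DIM)                               # global model
states = [int(rng.integers(N_STATES)) for _ in range(N_AGENTS)]
acts = [int(rng.integers(N_ACTIONS)) for _ in range(N_AGENTS)]
for _ in range(ROUNDS):
    local_models = []
    for k, (P, R) in enumerate(agents):
        th_k, states[k], acts[k] = local_sarsa(theta, P, R,
                                               states[k], acts[k], LOCAL_STEPS)
        local_models.append(th_k)
    theta = np.mean(local_models, axis=0)           # server averages local models

print(np.linalg.norm(theta))
```

Each agent keeps its own Markov chain state across rounds, so the sample path is genuinely Markovian rather than i.i.d., matching the sampling model the analysis addresses; the averaging step is where heterogeneity across agents enters.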