We consider a setting in which $N$ agents aim to speed up a common Stochastic Approximation (SA) problem by acting in parallel and communicating with a central server. We assume that the uplink transmissions to the server are subject to asynchronous and potentially unbounded time-varying delays. To mitigate the effect of delays and stragglers while reaping the benefits of distributed computation, we propose \texttt{DASA}, a Delay-Adaptive algorithm for multi-agent Stochastic Approximation. We provide a finite-time analysis of \texttt{DASA} assuming that the agents' stochastic observation processes are independent Markov chains. Significantly advancing existing results, \texttt{DASA} is the first algorithm whose convergence rate depends only on the mixing time $\tau_{mix}$ and on the average delay $\tau_{avg}$ while jointly achieving an $N$-fold convergence speedup under Markovian sampling. Our work is relevant for various SA applications, including multi-agent and distributed temporal difference (TD) learning, Q-learning, and stochastic optimization with correlated data.
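To make the setting concrete, the following is a minimal simulation sketch of delay-adaptive multi-agent stochastic approximation. It is *not* the paper's \texttt{DASA} update rule; all names and the specific delay-adaptive step-size choice (down-weighting a gradient by its staleness) are illustrative assumptions. Each agent sends noisy gradients of a simple quadratic objective over an uplink with random delay, and the server scales each arriving update by its staleness.

```python
import random

def delay_adaptive_sa(num_agents=4, num_steps=200, max_delay=5, seed=0):
    """Illustrative sketch (not the actual DASA algorithm): N agents send
    noisy gradients of f(theta) = 0.5 * (theta - theta_star)^2 through a
    channel with random uplink delay; the server shrinks the step size of
    stale updates as a simple delay-adaptive rule."""
    rng = random.Random(seed)
    theta_star = 1.0          # hypothetical target of the SA iteration
    theta = 0.0               # server iterate
    inbox = []                # in-flight messages: (arrival_step, sent_step, gradient)
    base_lr = 0.5

    for t in range(num_steps):
        # Each agent computes a noisy gradient at the current iterate and
        # transmits it with an asynchronous, random uplink delay.
        for _ in range(num_agents):
            grad = (theta - theta_star) + rng.gauss(0.0, 0.1)
            delay = rng.randint(0, max_delay)
            inbox.append((t + delay, t, grad))

        # The server applies every gradient that has arrived by step t,
        # scaling the step size down in proportion to its staleness.
        arrived = [(sent, g) for (arr, sent, g) in inbox if arr <= t]
        inbox = [(arr, sent, g) for (arr, sent, g) in inbox if arr > t]
        for sent, g in arrived:
            staleness = t - sent
            lr = base_lr / (num_agents * (1 + staleness))
            theta -= lr * g

    return theta
```

Under this toy model the iterate contracts toward `theta_star` despite unbounded-looking staleness, because stale gradients receive proportionally smaller steps; the actual \texttt{DASA} analysis is far more delicate, handling Markovian (rather than i.i.d.) noise and characterizing the rate via $\tau_{mix}$ and $\tau_{avg}$.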