Existing decentralized algorithms usually require knowledge of problem parameters to update their local iterates. For example, setting hyperparameters such as the learning rate typically requires the Lipschitz constant of the global gradient or topological information about the communication network, neither of which is generally accessible in practice. In this paper, we propose D-NASA, the first algorithm for decentralized nonconvex stochastic optimization that requires no prior knowledge of any problem parameters. We show that D-NASA achieves the optimal rate of convergence for nonconvex objectives under very mild conditions and enjoys the linear-speedup effect, i.e., the computation becomes faster as the number of nodes in the system increases. Extensive numerical experiments support our findings.