The stochastic compositional minimax problem has attracted a surge of attention in recent years because it covers many emerging machine learning models. Meanwhile, with the emergence of distributed data, optimizing this kind of problem in the decentralized setting has become increasingly necessary. However, the compositional structure of the loss function poses unique challenges for designing efficient decentralized optimization algorithms. In particular, our study shows that the standard gossip communication strategy cannot achieve linear speedup for decentralized compositional minimax problems because of the large consensus error in the inner-level function. To address this issue, we develop a novel decentralized stochastic compositional gradient descent ascent with momentum algorithm that reduces the consensus error in the inner-level function. Our theoretical results show that it achieves linear speedup with respect to the number of workers. We believe this algorithmic design could benefit the development of decentralized compositional optimization. Finally, we apply our method to the imbalanced classification problem, and extensive experimental results provide evidence for the effectiveness of our algorithm.
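To make the terms "gossip communication" and "consensus error" concrete, the following is a minimal sketch of plain gossip averaging, not the algorithm proposed in this work. It assumes a ring topology with uniform 1/3 weights and scalar local values. The function names (`ring_mixing_matrix`, `gossip_step`, `consensus_error`, `run_gossip`) are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: plain gossip averaging on a ring.
# Topology, weights, and all names here are assumptions, not taken from the paper.

def ring_mixing_matrix(num_workers):
    """Doubly stochastic mixing matrix: each worker averages itself and its two ring neighbors."""
    W = [[0.0] * num_workers for _ in range(num_workers)]
    for i in range(num_workers):
        for j in (i - 1, i, i + 1):
            W[i][j % num_workers] += 1.0 / 3.0
    return W

def gossip_step(W, values):
    """One gossip round: each worker replaces its value with a weighted neighbor average."""
    n = len(values)
    return [sum(W[i][j] * values[j] for j in range(n)) for i in range(n)]

def consensus_error(values):
    """Mean squared deviation of the local values from their global average."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def run_gossip(values, rounds):
    """Run several gossip rounds and record the consensus error after each one."""
    W = ring_mixing_matrix(len(values))
    errors = [consensus_error(values)]
    for _ in range(rounds):
        values = gossip_step(W, values)
        errors.append(consensus_error(values))
    return values, errors

if __name__ == "__main__":
    # Eight workers start with different local estimates of some inner-level quantity.
    final_values, errors = run_gossip([float(k) for k in range(8)], rounds=50)
    print("initial consensus error:", errors[0])
    print("final consensus error:", errors[-1])
```

In this toy setting the workers only mix fixed values, so the consensus error shrinks steadily toward zero while the global average is preserved. In a decentralized optimization algorithm each worker also applies a stochastic local update between gossip rounds, which keeps reintroducing disagreement; the abstract's claim concerns how large that residual disagreement becomes for the inner-level function.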