Stochastic compositional minimax problems have attracted a surge of attention in recent years because they cover many emerging machine learning models. Meanwhile, with the emergence of distributed data, optimizing this kind of problem in the decentralized setting has become an urgent need. However, the compositional structure of the loss function poses unique challenges for designing efficient decentralized optimization algorithms. In particular, our study shows that the standard gossip communication strategy cannot achieve linear speedup for decentralized compositional minimax problems because of the large consensus error in the inner-level function. To address this issue, we develop a novel decentralized stochastic compositional gradient descent ascent with momentum algorithm that reduces the consensus error in the inner-level function. Our theoretical results demonstrate that the algorithm achieves linear speedup with respect to the number of workers. We believe this algorithmic design could benefit the development of decentralized compositional optimization. Finally, we apply our method to the imbalanced classification problem, and extensive experimental results provide evidence for the effectiveness of our algorithm.
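For readers unfamiliar with the terms, the following is a minimal sketch of standard gossip averaging and of the consensus error it is meant to shrink. It is not the algorithm proposed here: the ring topology, the uniform 1/3 mixing weights, the scalar per-worker values, and the function names `ring_mixing_matrix`, `gossip_step`, and `consensus_error` are all assumptions made for illustration.

```python
def ring_mixing_matrix(num_workers):
    """Doubly stochastic mixing matrix for a ring (assumed topology).

    Each worker averages itself and its two ring neighbours with
    weight 1/3 each. Requires num_workers >= 3.
    """
    w = [[0.0] * num_workers for _ in range(num_workers)]
    for i in range(num_workers):
        for j in (i - 1, i, i + 1):
            w[i][j % num_workers] += 1.0 / 3.0
    return w


def gossip_step(w, values):
    """One gossip round: each worker replaces its value with a
    weighted average of its neighbours' values."""
    n = len(values)
    return [sum(w[i][j] * values[j] for j in range(n)) for i in range(n)]


def consensus_error(values):
    """Mean squared deviation of the workers' values from their average."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


if __name__ == "__main__":
    # Hypothetical local values, e.g. one inner-level estimate per worker.
    values = [float(k) for k in range(8)]
    w = ring_mixing_matrix(len(values))
    for step in range(20):
        values = gossip_step(w, values)
    print(consensus_error(values))
```

Because the mixing matrix is doubly stochastic, each round preserves the average across workers while pulling the individual values toward it. The abstract's claim concerns how slowly this error decays for the inner-level function when such averaging is combined with stochastic compositional updates, which this sketch does not model.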