This paper studies stochastic optimization for a sum of compositional functions, where the inner-level function of each summand is coupled with the corresponding summation index. We refer to this family of problems as finite-sum coupled compositional optimization (FCCO). FCCO has broad applications in machine learning for optimizing non-convex or convex compositional measures/objectives such as average precision (AP), p-norm push, listwise ranking losses, neighborhood component analysis (NCA), deep survival analysis, and deep latent variable models, and therefore deserves a finer analysis. Yet existing algorithms and analyses are restricted in one aspect or another. The contribution of this paper is a comprehensive convergence analysis of a simple stochastic algorithm for both non-convex and convex objectives. Our key result is an improved oracle complexity with parallel speed-up, obtained by using a moving-average-based estimator with mini-batching. Our theoretical analysis also yields new insights for improving the practical implementation, namely sampling batches of equal size for the outer and inner levels. Numerical experiments on AP maximization, NCA, and p-norm push corroborate some aspects of the theory.
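To make the structure of such a problem concrete, the following is a minimal sketch of a moving-average estimator with mini-batching on a toy FCCO instance. It is not the paper's algorithm as stated: the toy objective (f(u) = u^2 with g_i(w) = E[w - a_i + xi] for zero-mean noise xi), the function name `fcco_moving_average_sgd`, the parameter names, the step sizes, and the order of the updates are all assumptions made for illustration. The outer and inner batch sizes are set equal only to mirror the sampling scheme mentioned above; the sketch does not demonstrate any complexity result.

```python
import random


def fcco_moving_average_sgd(a, steps=2000, outer_batch=4, inner_batch=4,
                            eta=0.02, gamma=0.5, beta=0.5,
                            noise_std=0.5, seed=0):
    """Toy FCCO: minimize F(w) = (1/n) * sum_i f(g_i(w)) with
    f(u) = u**2 and g_i(w) = E_xi[w - a_i + xi] = w - a_i (hypothetical setup).
    The minimizer of this toy objective is the mean of a."""
    rng = random.Random(seed)
    n = len(a)
    w = 0.0
    u = [0.0] * n  # one moving-average estimate of g_i(w) per summand i
    v = 0.0        # moving average of the stochastic gradient estimates
    for _ in range(steps):
        # Outer level: sample a mini-batch of summand indices.
        outer = rng.sample(range(n), outer_batch)
        grad = 0.0
        for i in outer:
            # Inner level: mini-batch estimate of g_i(w), coupled with index i.
            g_hat = sum(w - a[i] + rng.gauss(0.0, noise_std)
                        for _ in range(inner_batch)) / inner_batch
            # Moving-average update, applied only to the sampled index.
            u[i] = (1.0 - gamma) * u[i] + gamma * g_hat
            # Chain rule: dg_i/dw = 1 and f'(u) = 2u, evaluated at the estimate.
            # (Assumption: the refreshed u[i] is used here.)
            grad += 1.0 * 2.0 * u[i]
        grad /= outer_batch
        v = (1.0 - beta) * v + beta * grad
        w -= eta * v
    return w


if __name__ == "__main__":
    targets = [0.5 * k for k in range(1, 9)]  # mean is 2.25
    print(fcco_moving_average_sgd(targets))
```

In this toy case the outer function is quadratic, so a plain mini-batch plug-in gradient would already be unbiased; the per-index estimates `u[i]` matter in the general case, where f_i is nonlinear in a way that makes plugging a noisy inner estimate into its gradient biased.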