Deep neural networks and reinforcement learning methods have empirically shown great promise in tackling challenging combinatorial problems. In these methods, a deep neural network serves as a solution generator and is trained with gradient-based methods (e.g., policy gradient) to obtain successively better solution distributions. In this work, we introduce a novel theoretical framework for analyzing the effectiveness of such methods. We ask whether there exist generative models that (i) are expressive enough to generate approximately optimal solutions; (ii) have a tractable number of parameters, i.e., polynomial in the size of the input; and (iii) have a benign optimization landscape, in the sense that it contains no sub-optimal stationary points. Our main contribution is a positive answer to this question. Our result holds for a broad class of combinatorial problems, including Max-Cut and Min-Cut, Max-$k$-CSP, Maximum-Weight-Bipartite-Matching, and the Traveling Salesman Problem. As a byproduct of our analysis, we introduce a novel regularization process over vanilla gradient descent and provide theoretical and experimental evidence that it helps address vanishing-gradient issues and escape bad stationary points.
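To make the "solution generator trained by policy gradient" setup concrete, here is a minimal sketch on Max-Cut. It is an illustration only, not the generative model or the regularization process analyzed in this work: it assumes a toy generator made of independent Bernoulli variables with one logit per vertex, and the names `cut_value` and `reinforce_max_cut`, as well as all hyperparameters, are hypothetical.

```python
import numpy as np


def cut_value(x, edges):
    """Number of edges crossing the partition encoded by the 0/1 vector x."""
    return sum(1 for u, v in edges if x[u] != x[v])


def reinforce_max_cut(n, edges, steps=200, batch=64, lr=0.5, seed=0):
    """Toy policy-gradient (REINFORCE) loop for Max-Cut.

    Assumption: the solution generator is a product of n independent
    Bernoulli variables with logits theta (a stand-in for a neural network).
    """
    rng = np.random.default_rng(seed)
    theta = 0.01 * rng.standard_normal(n)  # small random initial logits
    best_x, best_val = None, -1.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-theta))  # P(x_i = 1) for each vertex
        # Sample a batch of candidate solutions from the current distribution.
        samples = (rng.random((batch, n)) < p).astype(float)
        rewards = np.array([cut_value(x, edges) for x in samples], dtype=float)
        # Track the best sampled solution seen so far.
        i = int(np.argmax(rewards))
        if rewards[i] > best_val:
            best_val, best_x = float(rewards[i]), samples[i].copy()
        # Score-function gradient estimate with a mean-reward baseline.
        # For Bernoulli(sigmoid(theta)), d/dtheta log p(x) = x - p.
        advantages = rewards - rewards.mean()
        grad = (advantages[:, None] * (samples - p)).mean(axis=0)
        theta += lr * grad  # gradient ascent on the expected cut value
    return theta, best_x, best_val


if __name__ == "__main__":
    # 4-cycle: the maximum cut separates alternating vertices.
    edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
    theta, best_x, best_val = reinforce_max_cut(4, edges)
    print(best_x, best_val)
```

The mean-reward baseline is a common variance-reduction choice and is an assumption of this sketch; it does not change the expected gradient direction.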