Neural algorithmic reasoning (NAR) is a paradigm in which neural networks are trained by supervised learning to execute classical algorithms. Despite its successes, important limitations remain: an inability to construct valid solutions without post-processing or to reason about multiple correct solutions, poor performance on NP-hard combinatorial problems, and inapplicability to problems for which strong algorithms are not yet known. To address these limitations, we reframe the problem of learning algorithm trajectories as a Markov decision process (MDP), which imposes structure on the solution construction procedure and unlocks the powerful tools of imitation learning and reinforcement learning (RL). We propose the GNARL framework, which comprises a methodology for translating problem formulations from NAR to RL and a learning architecture suitable for a wide range of graph-based problems. We achieve very high graph accuracy on several CLRS-30 problems, performance matching or exceeding that of much narrower NAR approaches on NP-hard problems and, remarkably, applicability even when no expert algorithm is available.
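To make the Markov decision process framing concrete, the following is a minimal sketch on a toy problem and is not the GNARL implementation. It assumes an undirected graph given as an adjacency dictionary and casts a level-order traversal (nodes visited in nondecreasing distance from a source) as a sequential decision problem. The names `LevelOrderMDP`, `valid_actions`, `expert_actions`, and `rollout` are hypothetical and chosen for illustration only. The sketch shows two properties mentioned above: restricting actions keeps every partial solution valid without post-processing, and the expert may accept several correct actions in the same state.

```python
from collections import deque


def bfs_distances(adjacency, source):
    """Shortest hop distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


class LevelOrderMDP:
    """Toy MDP (illustrative assumption): the state is the list of visited
    nodes, and an action selects the next node to visit."""

    def __init__(self, adjacency, source):
        self.adjacency = adjacency  # dict: node -> list of neighbours
        self.source = source
        self.dist = bfs_distances(adjacency, source)
        self.visited = [source]

    def reset(self):
        self.visited = [self.source]
        return tuple(self.visited)

    def valid_actions(self):
        """Unvisited nodes adjacent to a visited node, so the visited set
        stays connected by construction."""
        seen = set(self.visited)
        frontier = {v for u in self.visited for v in self.adjacency[u]}
        return sorted(frontier - seen)

    def expert_actions(self):
        """All valid actions consistent with a level-order traversal.
        There may be more than one correct action in a given state."""
        valid = self.valid_actions()
        if not valid:
            return []
        nearest = min(self.dist[v] for v in valid)
        return [v for v in valid if self.dist[v] == nearest]

    def step(self, action):
        if action not in self.valid_actions():
            raise ValueError(f"invalid action: {action!r}")
        self.visited.append(action)
        done = not self.valid_actions()
        return tuple(self.visited), done


def rollout(env, policy):
    """Run a policy (a function of the environment) until no action remains."""
    state = env.reset()
    done = not env.valid_actions()
    while not done:
        state, done = env.step(policy(env))
    return list(state)
```

As a usage example, on the 4-cycle `{0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}` with source `0`, both `1` and `2` are expert actions in the initial state, so a policy that always picks the first expert action and one that always picks the last both produce a correct traversal. In an imitation-learning setting, a learned policy could be trained to place its probability mass on the set returned by `expert_actions`; in an RL setting, the expert could be dropped and a reward defined on the final state instead.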