Modern neural network architectures still struggle to learn algorithmic procedures that require systematically applying compositional rules to solve out-of-distribution problem instances. In this work, we propose an original approach to learning algorithmic tasks, inspired by rewriting systems, a classic framework in symbolic artificial intelligence. We show that a rewriting system can be implemented as a neural architecture composed of specialized modules: the Selector identifies the target sub-expression to process, the Solver simplifies the sub-expression by computing its result, and the Combiner produces a new version of the original expression by replacing the sub-expression with the Solver's solution. We evaluate our model on three types of algorithmic tasks that require simplifying symbolic formulas involving lists, arithmetic, and algebraic expressions. We test the extrapolation capabilities of the proposed architecture on formulas with more operands and deeper nesting than those seen during training, and we benchmark its performance against the Neural Data Router, a recent model specialized for systematic generalization, and a state-of-the-art large language model (GPT-4) probed with advanced prompting strategies.
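To make the Selector, Solver, and Combiner loop concrete, the following is a minimal symbolic sketch of the rewriting procedure on fully parenthesized integer arithmetic. It is not the proposed neural architecture: each module is replaced here by a hand-written stand-in, and the function names (`select`, `solve`, `combine`, `simplify`), the regular expression, and the input format are all assumptions made for illustration only.

```python
import re

# Assumed input format: fully parenthesized binary expressions over integers,
# e.g. "((2+3)*(4-1))". An innermost sub-expression looks like "(a op b)".
LEAF = re.compile(r"\((-?\d+)([+\-*])(-?\d+)\)")


def select(expr):
    """Selector stand-in: find the first innermost sub-expression, or None."""
    return LEAF.search(expr)


def solve(match):
    """Solver stand-in: compute the value of the selected sub-expression."""
    a, op, b = int(match.group(1)), match.group(2), int(match.group(3))
    results = {"+": a + b, "-": a - b, "*": a * b}
    return str(results[op])


def combine(expr, match, solution):
    """Combiner stand-in: replace the selected span with its solution."""
    return expr[:match.start()] + solution + expr[match.end():]


def simplify(expr, max_steps=100):
    """Apply select -> solve -> combine until nothing is left to rewrite."""
    trace = [expr]
    for _ in range(max_steps):
        match = select(expr)
        if match is None:  # no reducible sub-expression remains
            break
        expr = combine(expr, match, solve(match))
        trace.append(expr)
    return expr, trace


if __name__ == "__main__":
    result, trace = simplify("((2+3)*(4-1))")
    print(" -> ".join(trace))  # ((2+3)*(4-1)) -> (5*(4-1)) -> (5*3) -> 15
```

Each iteration only looks at a small local sub-expression, so the same loop handles deeper nesting or more operands without modification. This is the intuition behind testing extrapolation on formulas larger than those seen in training, although in the proposed model the three steps are learned rather than hard-coded.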