Neural algorithmic reasoning is an emerging research direction that endows neural networks with the ability to mimic algorithmic executions step by step. A common paradigm in existing designs is to use historical embeddings when predicting the results of future execution steps. Our observation in this work is that such historical dependence intrinsically contradicts the Markov nature of algorithmic reasoning tasks. Motivated by this observation, we present ForgetNet, which does not use historical embeddings and is thus consistent with the Markov nature of the tasks. To address challenges in training ForgetNet at early stages, we further introduce G-ForgetNet, which uses a gating mechanism to allow the selective integration of historical embeddings. This added capability provides valuable computational pathways during the model's early training phase. Our extensive experiments on the CLRS-30 algorithmic reasoning benchmark demonstrate that both ForgetNet and G-ForgetNet generalize better than existing methods. Furthermore, we investigate the behavior of the gating mechanism, highlighting its degree of alignment with our intuitions and its effectiveness for robust performance.
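
The contrast between the three designs can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: `processor` is a hypothetical stand-in for a learned processor network, all function names are invented here, and applying an element-wise sigmoid gate to the previous embedding is only one plausible reading of "gating mechanism".

```python
import math


def sigmoid(v):
    """Logistic function, used here to squash gate logits into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def processor(x, h_in):
    """Hypothetical stand-in for a learned processor network.

    It mixes the current step's input x with whatever history h_in it receives.
    """
    return [math.tanh(xi + hi) for xi, hi in zip(x, h_in)]


def baseline_step(x, h_prev):
    """History-dependent design: the previous embedding is always fed in."""
    return processor(x, h_prev)


def forget_step(x, h_prev):
    """ForgetNet-style step: h_prev is ignored, so the update is Markov in x."""
    return processor(x, [0.0] * len(x))


def gated_step(x, h_prev, gate_logits):
    """G-ForgetNet-style step (assumed form): a gate scales h_prev element-wise."""
    gate = [sigmoid(g) for g in gate_logits]
    gated_history = [g * h for g, h in zip(gate, h_prev)]
    return processor(x, gated_history)
```

In this sketch, strongly negative gate logits make the gated step behave like the ForgetNet-style step, while strongly positive logits recover the history-dependent baseline. Intermediate gate values let the model draw on history selectively, which is the pathway the abstract describes as useful early in training.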