The Hierarchical Reasoning Model (HRM) achieves extraordinary performance on various reasoning tasks, significantly outperforming reasoners based on large language models. To understand the strengths and potential failure modes of HRM, we conduct a mechanistic study of its reasoning patterns and find three surprising facts: (a) failure on extremely simple puzzles, e.g., HRM can fail on a puzzle with only one unknown cell, which we attribute to a violation of the fixed-point property, a fundamental assumption of HRM; (b) "grokking" dynamics in reasoning steps, i.e., the answer does not improve uniformly; instead, there is a critical reasoning step at which it suddenly becomes correct; (c) the existence of multiple fixed points: HRM "guesses" the first fixed point, which may be incorrect, and remains trapped there for a while or forever. Together, these facts suggest that HRM is "guessing" rather than "reasoning". Leveraging this "guessing" picture, we propose three strategies for scaling HRM's guesses: data augmentation (scaling the quality of guesses), input perturbation (scaling the number of guesses by exploiting inference-time randomness), and model bootstrapping (scaling the number of guesses by exploiting training randomness). On the practical side, combining all three methods yields Augmented HRM, which boosts accuracy on Sudoku-Extreme from 54.5% to 96.9%. On the scientific side, our analysis provides new insights into how reasoning models "reason".
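The input-perturbation strategy can be illustrated with a minimal, hypothetical sketch (not the paper's code): query a solver on several symmetry-perturbed copies of a puzzle and keep the first candidate answer that passes verification, which is possible because Sudoku solutions are cheap to check. Here a 4x4 Sudoku, digit-relabeling perturbations, and a brute-force backtracking solver stand in for Sudoku-Extreme and HRM.

```python
import random

N, B = 4, 2  # 4x4 grid with 2x2 boxes; 0 marks an unknown cell

def is_valid_sudoku(grid):
    """True iff every row, column, and box contains exactly 1..N."""
    rows = [list(r) for r in grid]
    cols = [[grid[r][c] for r in range(N)] for c in range(N)]
    boxes = [[grid[br + i][bc + j] for i in range(B) for j in range(B)]
             for br in range(0, N, B) for bc in range(0, N, B)]
    return all(sorted(g) == list(range(1, N + 1)) for g in rows + cols + boxes)

def matches_clues(candidate, puzzle):
    """True iff the candidate agrees with every given (nonzero) clue."""
    return all(p == 0 or p == c
               for prow, crow in zip(puzzle, candidate)
               for p, c in zip(prow, crow))

def relabel(grid, perm):
    """Apply a digit permutation, a symmetry of Sudoku (perm[0] == 0)."""
    return [[perm[v] for v in row] for row in grid]

def solve_with_guesses(model, puzzle, n_guesses=8, seed=0):
    """Query the model on randomly relabeled copies of the puzzle and
    return the first verified answer, mapped back to the original digits."""
    rng = random.Random(seed)
    for _ in range(n_guesses):
        digits = list(range(1, N + 1))
        shuffled = digits[:]
        rng.shuffle(shuffled)
        perm = {0: 0, **dict(zip(digits, shuffled))}
        inv = {v: k for k, v in perm.items()}
        candidate = relabel(model(relabel(puzzle, perm)), inv)
        if is_valid_sudoku(candidate) and matches_clues(candidate, puzzle):
            return candidate
    return None  # every guess failed verification

def brute_force_model(grid):
    """Deterministic backtracking solver, standing in for a learned model."""
    g = [row[:] for row in grid]
    def fill(i):
        if i == N * N:
            return True
        r, c = divmod(i, N)
        if g[r][c]:
            return fill(i + 1)
        br, bc = (r // B) * B, (c // B) * B
        for v in range(1, N + 1):
            if (v not in g[r]
                    and all(g[k][c] != v for k in range(N))
                    and all(g[br + i2][bc + j2] != v
                            for i2 in range(B) for j2 in range(B))):
                g[r][c] = v
                if fill(i + 1):
                    return True
                g[r][c] = 0
        return False
    fill(0)
    return g

puzzle = [[1, 0, 3, 4],
          [0, 4, 1, 2],
          [2, 1, 0, 3],
          [4, 3, 2, 0]]
answer = solve_with_guesses(brute_force_model, puzzle)
assert is_valid_sudoku(answer) and matches_clues(answer, puzzle)
```

Because each guess is verified before being accepted, adding guesses can only help: a wrong fixed point fails the check and triggers another perturbed attempt, which is the intuition behind scaling the number of guesses.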