When post-trained language models fail on reasoning problems, the common test-time-scaling response is to spend more compute on additional attempts, and the failed traces play no further role. We argue this discards a crucial signal: some failures come from unlucky sampling, where more rollouts help, while others are structural and resist resampling regardless of budget. We propose that failed traces encode recoverability structure: the inference-time signature of which test-time interventions can rescue a given failure. Three problem-level trajectory features, derived from the structure of available interventions, recover this structure from the distributional signature of failed rollouts rather than from their text. They cluster failures into stable regimes, characterize the failure topography of different post-training methods ($84.3{\pm}4.3\%$ accuracy, $+20\%$ over a majority-class baseline), and support a training-free routing rule that lifts rescue by $+12.2\%$ on the deployment-relevant Steerable-Hard subset (failures where retry is insufficient and a bounded intervention is reachable). The features and the routing rule transfer across two cross-family probes. The same three features thus convert failed traces from discarded data into a diagnostic object, supporting test-time routing and post-training analysis without training-time or weight-space access.