Synthetic post-training pipelines commonly filter generated samples with reward models or holistic LLM judges, yet two questions are rarely examined together: whether the filtering signal is grounded in the source evidence that induced each generation, and whether rejected samples can be systematically recovered rather than permanently discarded. We present a controlled study of both questions across gate configurations, recovery strategies, and generator scales, using adversarially injected corpora to provide ground-truth failure labels. We find that exact source provenance improves faithfulness gating for stronger judges; that hallucination and reward gates reject largely disjoint sample populations, so both are necessary; and that an adaptive recovery pipeline combining failure diagnosis with targeted regeneration achieves higher yield, recovery rate, and injection recall than naive resampling. Downstream fine-tuning quality is driven primarily by generator scale, with filtration and recovery conditions contributing meaningfully but secondarily.