Weak-to-Strong generalization (W2SG) is an emerging paradigm for eliciting the full capabilities of a strong model using supervision from a weak model. While existing W2SG studies focus on simple tasks such as binary classification, we extend this paradigm to complex interactive decision-making environments. Specifically, we fine-tune a strong model on trajectories of intermediate actions generated by a weak model. Motivated by the human learning process, we propose to generalize not only knowledge from successes but also experience from failures, so that the strong model can learn from the failed trajectories accumulated by weak models. To elicit the potential of strong agents effectively and efficiently, we further construct ``trajectory trees,'' a hierarchical representation that organizes weak-model-generated action trajectories, coupled with Monte Carlo Tree Search (MCTS) to optimize the strong model. Through theoretical analysis, we provide formal guarantees for the effectiveness of our method in improving W2SG performance. Our empirical evaluations demonstrate substantial improvements in reasoning and decision-making capabilities across diverse task domains, validating the scalability and robustness of the proposed framework.
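To make the idea of a trajectory tree concrete, the following is a minimal sketch and not the paper's construction, which the abstract does not specify. It assumes that weak-model trajectories are lists of discrete action strings labeled with a success flag, and that trajectories sharing a prefix are merged into one branch. The names `Node`, `build_trajectory_tree`, and `contrast_siblings` are hypothetical, and the MCTS optimization step is not implemented here.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One action in the tree; children are keyed by the next action."""
    action: str = "<root>"
    visits: int = 0
    successes: int = 0
    children: dict = field(default_factory=dict)

    @property
    def success_rate(self) -> float:
        return self.successes / self.visits if self.visits else 0.0


def build_trajectory_tree(trajectories):
    """Merge (actions, succeeded) pairs into a prefix tree (assumed layout)."""
    root = Node()
    for actions, succeeded in trajectories:
        node = root
        node.visits += 1
        node.successes += int(succeeded)
        for action in actions:
            node = node.children.setdefault(action, Node(action))
            node.visits += 1
            node.successes += int(succeeded)
    return root


def contrast_siblings(node, prefix=()):
    """Yield (prefix, better, worse) for sibling actions whose success rates differ.

    Illustrative only: one way failed branches could supply a training signal.
    """
    ranked = sorted(node.children.values(),
                    key=lambda c: c.success_rate, reverse=True)
    for i, better in enumerate(ranked):
        for worse in ranked[i + 1:]:
            if better.success_rate > worse.success_rate:
                yield prefix, better.action, worse.action
    for child in node.children.values():
        yield from contrast_siblings(child, prefix + (child.action,))


# Hypothetical weak-model rollouts: one success, two failures.
weak_trajectories = [
    (["search", "click", "buy"], True),
    (["search", "click", "back"], False),
    (["search", "scroll"], False),
]
tree = build_trajectory_tree(weak_trajectories)
pairs = list(contrast_siblings(tree))
```

In this toy example the failed trajectories are kept in the tree instead of being discarded, so each branching point records which continuation led to success and which did not. A search procedure such as MCTS could use per-node statistics like these to decide which branches to expand, though how the paper does so is not stated in the abstract.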