Model evolution enables a model to learn from feedback, refine its experience, and update its skills, transforming it from a model with no domain knowledge into a domain expert. However, there is currently no unified and effective method for guiding this evolutionary process. To address this gap, we propose the Meteor method, which comprises three training phases: weak-to-strong data distillation, iterative training, and self-evolution strategies. Each phase is designed to maximize the model's inherent domain capabilities, allowing it to autonomously refine its domain knowledge and enhance its performance. Experiments demonstrate that our approach significantly improves accuracy, completeness, relevance, coherence, and reliability across domain-specific tasks.