We introduce Nova, a suite of practical alignment techniques employed in a series of empirically validated high-performing models. This report represents the first comprehensive account of alignment methodologies, offering valuable insights for advancing AI research. We investigate the critical components that enhance model performance during the alignment process, including optimization methods, data strategies, capability enhancements, and evaluation processes. The process spans three key stages: Prompt Augmentation System (PAS), Supervised Fine-Tuning (SFT), and Preference Alignment. We thoroughly record the problems encountered, the solutions applied, and the improvements made. Through comparisons across well-established benchmarks, we highlight the technological advancements enabled by Nova Alignment. Qwen2-Nova-72B and Llama3-PBM-Nova-70B are instruct versions of the Qwen2-72B and Llama-3-70B base models, respectively, optimized through Nova. The Nova models show significant improvements in core capabilities, with user experience gains of 17% to 28%, and excel on specialized benchmarks. In open-source benchmark evaluations, both Qwen2-Nova-72B and Llama3-PBM-Nova-70B consistently outperform their respective official instruct versions across nearly all datasets. This report aims to clarify the key technologies behind the alignment process, fostering a deeper understanding within the community. The Llama3-PBM-Nova-70B model is available at https://huggingface.co/PKU-Baichuan-MLSystemLab/Llama3-PBM-Nova-70B.