In this work, we develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters. Our fine-tuned LLMs, called Llama 2-Chat, are optimized for dialogue use cases. Our models outperform open-source chat models on most benchmarks we tested, and based on our human evaluations for helpfulness and safety, may be a suitable substitute for closed-source models. We provide a detailed description of our approach to fine-tuning and safety improvements of Llama 2-Chat in order to enable the community to build on our work and contribute to the responsible development of LLMs.