This work introduces Weaver, our first family of large language models (LLMs) dedicated to content creation. Weaver is pre-trained on a carefully selected corpus that focuses on improving the writing capabilities of large language models. We then fine-tune Weaver for creative and professional writing purposes and align it to the preferences of professional writers using a suite of novel methods for instruction data synthesis and LLM alignment, making it able to produce more human-like texts and follow more diverse instructions for content creation. The Weaver family comprises four model sizes: Weaver Mini (1.8B), Weaver Base (6B), Weaver Pro (14B), and Weaver Ultra (34B). These models suit different applications and can be dynamically dispatched by a routing agent according to query complexity to balance response quality and computation cost. Evaluation on a carefully curated benchmark for assessing the writing capabilities of LLMs shows that Weaver models of all sizes outperform generalist LLMs several times their size. Notably, our most capable model, Weaver Ultra, surpasses GPT-4, a state-of-the-art generalist LLM, in various writing scenarios, demonstrating the advantage of training specialized LLMs for writing purposes. Moreover, Weaver natively supports retrieval-augmented generation (RAG) and function calling (tool usage). We present various use cases of these abilities for improving AI-assisted writing systems, including integration of external knowledge bases, tools, or APIs, and providing personalized writing assistance. Furthermore, we discuss and summarize guidelines and best practices for pre-training and fine-tuning domain-specific LLMs.
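To make the idea of dispatching by query complexity concrete, here is a minimal Python sketch. The abstract does not say how the routing agent scores a query or where it draws the line between models, so the `estimate_complexity` heuristic, the `max_complexity` thresholds, and the `route` function are all hypothetical stand-ins; only the four model names and parameter counts come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    params_billion: float
    max_complexity: float  # hypothetical upper bound on the score this tier handles


# Model names and sizes are from the abstract; the thresholds are assumptions.
TIERS = [
    Tier("Weaver Mini", 1.8, 0.25),
    Tier("Weaver Base", 6.0, 0.50),
    Tier("Weaver Pro", 14.0, 0.75),
    Tier("Weaver Ultra", 34.0, 1.00),
]

# Hypothetical cue words that suggest a more constrained writing task.
CONSTRAINT_WORDS = {"must", "style", "tone", "outline", "cite", "rewrite"}


def estimate_complexity(query: str) -> float:
    """Toy heuristic (an assumption): longer, more constrained queries score higher.

    Returns a value in [0, 1].
    """
    words = query.split()
    length_score = min(len(words) / 200.0, 1.0)
    hits = sum(w.lower().strip(".,;:") in CONSTRAINT_WORDS for w in words)
    constraint_score = min(hits / 5.0, 1.0)
    return 0.5 * length_score + 0.5 * constraint_score


def route(query: str) -> Tier:
    """Pick the smallest tier whose threshold covers the estimated complexity."""
    score = estimate_complexity(query)
    for tier in TIERS:
        if score <= tier.max_complexity:
            return tier
    return TIERS[-1]


if __name__ == "__main__":
    print(route("Fix this typo.").name)
```

Choosing the smallest tier that covers the score is one simple way to trade response quality against computation cost; a real router could instead be a learned classifier or an LLM-based agent.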