Large language models (LLMs) have demonstrated strong capabilities in language understanding, generation, and reasoning, yet their potential in finance remains underexplored because financial knowledge is complex and highly specialized. In this work, we report the development of the Baichuan4-Finance series, which includes a comprehensive suite of foundation models, Baichuan4-Finance-Base, and an aligned chat model, Baichuan4-Finance. The series is built upon the Baichuan4-Turbo base model and tailored to the finance domain. First, we devote significant effort to building a detailed pipeline for improving data quality. Second, in the continual pre-training phase, we propose a novel domain self-constraint training strategy that enables Baichuan4-Finance-Base to acquire financial knowledge without losing general capabilities. After supervised fine-tuning and reinforcement learning from human feedback and AI feedback, the chat model Baichuan4-Finance can tackle a variety of financial certification questions and real-world application scenarios. We evaluate Baichuan4-Finance on many widely used general datasets and two holistic financial benchmarks. The results show that Baichuan4-Finance-Base surpasses almost all competitive baselines on financial tasks by significant margins without sacrificing performance on general LLM benchmarks. Baichuan4-Finance, in turn, performs even more strongly in financial application scenarios, showcasing its potential to foster community innovation in the financial LLM field.