FinTeamExperts: Role Specialized MOEs For Financial Analysis

Large Language Models (LLMs), such as ChatGPT, Phi3 and Llama-3, are leading a significant leap in AI, as they can generalize knowledge from their training to new tasks without fine-tuning. However, their application in the financial domain remains relatively limited. The financial field is inherently complex, requiring a deep understanding from multiple perspectives, ranging from macroeconomic and microeconomic trends to quantitative analysis. Motivated by this complexity, a mixture of expert LLMs tailored to specific financial domains could offer a more comprehensive understanding of intricate financial tasks. In this paper, we present FinTeamExperts, a role-specialized LLM framework structured as a Mixture of Experts (MOEs) for financial analysis. The framework simulates a collaborative team setting by training each model to specialize in a distinct role: Macro Analyst, Micro Analyst, or Quantitative Analyst. This role-specific specialization enhances each model's ability to integrate its domain-specific expertise. We achieve this by training three 8-billion-parameter models on different corpora, each dedicated to excelling in a specific finance-related role. We then instruction-tune FinTeamExperts on downstream tasks to align it with practical financial tasks. The experimental results show that FinTeamExperts outperforms all models of the same size and larger on three out of four datasets. On the fourth dataset, which presents a more complex task, FinTeamExperts still surpasses all models of the same size. These results highlight the effectiveness of our role-based specialization and continued-training approach for FinTeamExperts.

