Dólares or Dollars? Unraveling the Bilingual Prowess of Financial LLMs Between Spanish and English

Despite Spanish's pivotal role in the global finance industry, Spanish financial natural language processing (NLP) and its applications remain far less studied than their English counterparts, especially in the era of large language models (LLMs). To bridge this gap, we unveil Toisón de Oro, the first bilingual framework that establishes instruction datasets, fine-tuned LLMs, and an evaluation benchmark for financial LLMs in Spanish jointly with English. We construct a rigorously curated bilingual instruction dataset of over 144K Spanish and English samples drawn from 15 datasets covering 7 tasks. Building on this dataset, we introduce FinMA-ES, an LLM designed for bilingual financial applications. We evaluate our model and existing LLMs on FLARE-ES, the first comprehensive bilingual evaluation benchmark, with 21 datasets covering 9 tasks. The FLARE-ES results reveal a significant multilingual performance gap and bias in existing LLMs. FinMA-ES models surpass SOTA LLMs such as GPT-4 on Spanish financial tasks, owing to strategic instruction tuning and the use of data from diverse linguistic resources, which highlights the positive impact of cross-linguistic transfer. All our datasets, models, and benchmarks have been released.

