Large language models (LLMs) have shown the potential to revolutionize natural language processing tasks across diverse domains, sparking great interest in finance. Accessing high-quality financial data is the first challenge for financial LLMs (FinLLMs). Proprietary models such as BloombergGPT have taken advantage of their unique data accumulation, but such privileged access calls for an open-source alternative that democratizes Internet-scale financial data. In this paper, we present FinGPT, an open-source large language model for the finance sector. Unlike proprietary models, FinGPT takes a data-centric approach, providing researchers and practitioners with accessible and transparent resources to develop their own FinLLMs. We highlight the importance of an automatic data curation pipeline and the lightweight low-rank adaptation technique in building FinGPT. Furthermore, we showcase several potential applications as stepping stones for users, such as robo-advising, algorithmic trading, and low-code development. Through collaborative efforts within the open-source AI4Finance community, FinGPT aims to stimulate innovation, democratize FinLLMs, and unlock new opportunities in open finance. The two associated code repositories are \url{https://github.com/AI4Finance-Foundation/FinGPT} and \url{https://github.com/AI4Finance-Foundation/FinNLP}.