Fine-tuning of lightweight large language models for sentiment classification on heterogeneous financial textual data

Large language models (LLMs) play an increasingly important role in financial market analysis by capturing signals from complex and heterogeneous textual data sources, such as tweets, news articles, reports, and microblogs. However, their performance depends on large computational resources and proprietary datasets, which are costly, restricted, and therefore inaccessible to many researchers and practitioners. To reflect realistic conditions, we investigate the ability of lightweight open-source LLMs (smaller, publicly available models designed to operate with limited computational resources) to generalize sentiment understanding from financial datasets of varying sizes, sources, formats, and languages. We compare the benchmark finance natural language processing (NLP) model, FinBERT, with three open-source lightweight LLMs, DeepSeek-LLM 7B, Llama3 8B Instruct, and Qwen3 8B, on five publicly available datasets: FinancialPhraseBank, Financial Question Answering, Gold News Sentiment, Twitter Sentiment, and Chinese Finance Sentiment. We find that the LLMs, especially Qwen3 8B and Llama3 8B, perform best in most scenarios, even when using only 5% of the available training data. These results also hold in zero-shot and few-shot learning scenarios. Our findings indicate that lightweight open-source LLMs constitute a cost-effective option, as they can achieve competitive performance on heterogeneous textual data even when trained on only a limited subset of the extensive annotated corpora that are typically deemed necessary.
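To make the "5% of the available training data" setting concrete, the following is a minimal sketch of one way such a reduced training subset could be drawn. The abstract does not state how the subset was sampled, so the per-label (stratified) sampling, the function name `stratified_subset`, the `(text, label)` pair format, and the fixed seed are all assumptions made for illustration only.

```python
import random
from collections import defaultdict


def stratified_subset(examples, fraction=0.05, seed=0):
    """Return roughly `fraction` of `examples`, sampled separately per label.

    `examples` is a list of (text, label) pairs (a hypothetical format).
    Each label keeps at least one example, so no sentiment class
    disappears from the reduced training set.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")

    rng = random.Random(seed)  # fixed seed makes the subset reproducible

    # Group the examples by their sentiment label.
    by_label = defaultdict(list)
    for text, label in examples:
        by_label[label].append((text, label))

    # Sample the same fraction from every label group.
    subset = []
    for label in sorted(by_label):
        group = by_label[label]
        k = max(1, round(len(group) * fraction))
        subset.extend(rng.sample(group, k))

    rng.shuffle(subset)  # avoid label-ordered batches downstream
    return subset
```

Sampling per label rather than uniformly is a design choice here, not something taken from the abstract: it keeps the class balance of the small subset close to that of the full dataset, which matters when one sentiment class is rare.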
