Entity-level fine-grained sentiment analysis in the financial domain is a crucial subtask of sentiment analysis and currently faces numerous challenges. The primary challenge is the lack of high-quality, large-scale annotated corpora designed specifically for financial text sentiment analysis, which limits the data available for developing effective text processing techniques. Recent advances in large language models (LLMs) have yielded remarkable performance on natural language processing tasks, primarily those centered on language pattern matching. In this paper, we propose FinChina SA, a novel and extensive Chinese fine-grained financial sentiment analysis dataset for enterprise early warning. We thoroughly evaluate well-known open-source LLMs on our dataset. We firmly believe that our dataset will serve as a valuable resource for advancing the exploration of real-world financial sentiment analysis tasks, which should be the focus of future research. Our dataset and all code needed to replicate the experimental results will be released.