Entity-level fine-grained sentiment analysis in the financial domain is a crucial subtask of sentiment analysis and currently faces numerous challenges. The primary challenge is the lack of high-quality, large-scale annotated corpora designed specifically for financial text sentiment analysis, which in turn limits the data available for developing effective text processing techniques. Recent advances in large language models (LLMs) have yielded remarkable performance on natural language processing tasks, primarily those centered on language pattern matching. In this paper, we propose FinChina SA, a novel and extensive Chinese fine-grained financial sentiment analysis dataset for enterprise early warning. We thoroughly evaluate well-known open-source LLMs on our dataset. We firmly believe that our dataset will serve as a valuable resource for advancing the exploration of real-world financial sentiment analysis tasks, which should be a focus of future research. Our dataset and all code needed to replicate the experimental results will be released.