Financial sentiment analysis helps uncover latent patterns and detect emerging trends, enabling individuals to make well-informed decisions that may yield substantial advantages in the constantly changing world of finance. Large Language Models (LLMs) have recently proven effective across diverse domains, showing remarkable capabilities in zero-shot and few-shot in-context learning on a range of Natural Language Processing (NLP) tasks. Their potential and applicability for financial sentiment analysis, however, have not yet been thoroughly explored. To bridge this gap, we employ two approaches: in-context learning (focusing on the gpt-3.5-turbo model) and fine-tuning LLMs on a finance-domain dataset. Because fine-tuning LLMs with large parameter counts is computationally costly, we restrict fine-tuning to smaller LLMs ranging from 250M to 3B parameters. We then compare their performance with state-of-the-art results to evaluate their effectiveness in the finance domain. Our results show that fine-tuned smaller LLMs achieve performance comparable to state-of-the-art fine-tuned LLMs, despite having fewer parameters and using a smaller training dataset. In addition, the zero-shot and one-shot performance of LLMs is comparable to that of the fine-tuned smaller LLMs and to state-of-the-art results. Finally, our analysis shows no improvement in finance-domain sentiment analysis performance when the number of shots for in-context learning is increased.
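To make the distinction between zero-shot, one-shot, and few-shot in-context learning concrete, the following is a minimal sketch of how such prompts might be assembled. It is an illustration only and not the prompt used in this work: the label set (`positive`, `negative`, `neutral`), the instruction wording, the helper name `build_prompt`, and the example sentences are all assumptions introduced here.

```python
from typing import Sequence, Tuple

# Assumed label set; the abstract does not specify the labels used.
LABELS = ("positive", "negative", "neutral")


def build_prompt(text: str, shots: Sequence[Tuple[str, str]] = ()) -> str:
    """Build a k-shot sentiment prompt, where k = len(shots).

    With no shots this is a zero-shot prompt; each (sentence, label)
    pair in `shots` adds one labeled demonstration before the query.
    """
    blocks = [
        "Classify the sentiment of the financial sentence as "
        + ", ".join(LABELS) + "."
    ]
    for example, label in shots:
        if label not in LABELS:
            raise ValueError(f"unknown label: {label!r}")
        blocks.append(f"Sentence: {example}\nSentiment: {label}")
    # The query is left unlabeled so the model completes the label.
    blocks.append(f"Sentence: {text}\nSentiment:")
    return "\n\n".join(blocks)


# Hypothetical demonstration and query sentences.
demo = [("Quarterly profit rose sharply year over year.", "positive")]
zero_shot = build_prompt("The company cut its full-year guidance.")
one_shot = build_prompt("The company cut its full-year guidance.", demo)
```

The resulting string would be sent as the user message to a chat model such as gpt-3.5-turbo; increasing the number of shots only means passing more labeled pairs in `shots`.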