Large language models (LLMs) and their fine-tuning techniques have demonstrated strong performance in a variety of language understanding and generation tasks. This paper explores fine-tuning LLMs for stock return forecasting with financial newsflow. In quantitative investing, return forecasting is fundamental to downstream tasks such as stock picking and portfolio optimization. We formulate the model as a text representation module followed by a forecasting module. We compare encoder-only and decoder-only LLMs, since they generate text representations in distinct ways and the impact of these different representations on forecasting performance remains an open question. We also compare two simple methods of integrating LLMs' token-level representations into the forecasting module. Experiments on real news and investment universes reveal that: (1) aggregated representations from LLMs' token-level embeddings generally produce return predictions that enhance the performance of long-only and long-short portfolios; (2) in the relatively large investment universe, prediction models based on decoder LLMs lead to stronger portfolios, whereas in the small universes there is no consistent winner. Among the three LLMs studied (DeBERTa, Mistral, Llama), Mistral performs more robustly across different universes; (3) return predictions derived from LLMs' text representations are a strong signal for portfolio construction, outperforming conventional sentiment scores.
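To make the pipeline described above concrete, the following is a minimal sketch of one way token-level embeddings could be aggregated and turned into a long-short portfolio. It is an illustration under stated assumptions, not the method used in the paper: the abstract does not specify the aggregation rule, so masked mean pooling is assumed here, and the linear head, the 20% quantile cutoff, and the function names `mean_pool` and `long_short_weights` are hypothetical. Random arrays stand in for LLM outputs.

```python
import numpy as np


def mean_pool(token_embeddings, attention_mask):
    """Average token embeddings over non-padding positions.

    token_embeddings: array of shape (batch, seq_len, dim)
    attention_mask:   array of shape (batch, seq_len), 1 = real token, 0 = padding
    Mean pooling is an assumed aggregation rule, chosen for illustration.
    """
    mask = attention_mask[..., None].astype(token_embeddings.dtype)
    summed = (token_embeddings * mask).sum(axis=1)
    counts = np.clip(mask.sum(axis=1), 1.0, None)  # avoid division by zero
    return summed / counts


def long_short_weights(predicted_returns, quantile=0.2):
    """Equal-weight long the top quantile and short the bottom quantile.

    The quantile cutoff is a hypothetical choice; it must not exceed 0.5.
    """
    predicted_returns = np.asarray(predicted_returns, dtype=float)
    n = predicted_returns.shape[0]
    if n < 2 or not 0.0 < quantile <= 0.5:
        raise ValueError("need at least 2 stocks and 0 < quantile <= 0.5")
    k = max(1, int(n * quantile))
    order = np.argsort(predicted_returns)  # ascending by predicted return
    weights = np.zeros(n)
    weights[order[-k:]] = 1.0 / k   # long leg sums to +1
    weights[order[:k]] = -1.0 / k   # short leg sums to -1
    return weights


# Stand-in data: 10 stocks, 6 tokens each (last 2 are padding), 4-dim embeddings.
rng = np.random.default_rng(0)
emb = rng.normal(size=(10, 6, 4))
mask = np.ones((10, 6), dtype=int)
mask[:, 4:] = 0

features = mean_pool(emb, mask)   # one vector per stock
head = rng.normal(size=4)         # hypothetical linear forecasting head
preds = features @ head           # one predicted return per stock
w = long_short_weights(preds)     # dollar-neutral portfolio weights
```

A long-only variant would keep only the positive weights. In practice the stand-in arrays would be replaced by the hidden states of the fine-tuned LLM, and the head would be trained jointly with it.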