Traditionally, traders and quantitative analysts address alpha decay by manually crafting formulaic alphas (mathematical expressions that identify patterns or signals in financial data), relying on domain expertise and trial and error. This process is time-consuming and difficult to scale. Recent advances in large language models (LLMs) make it possible to automate the generation of such alphas by leveraging the models' reasoning capabilities. This paper introduces a framework that integrates a prompt-based LLM with a Transformer model for stock price prediction. The LLM first generates diverse and adaptive alphas from structured inputs: historical stock features (Close, Open, High, Low, Volume), technical indicators, and sentiment scores of both the target company and related companies. Instead of being used directly for trading, these alphas are treated as high-level features that capture complex dependencies within the financial data. To evaluate the effectiveness of the LLM-generated formulaic alphas, the alpha features are fed into prediction models, namely Transformer, LSTM, TCN, SVR, and Random Forest, to forecast future stock prices. Experimental results demonstrate that the LLM-generated alphas significantly improve predictive accuracy. Moreover, the natural language reasoning that the LLM provides alongside each alpha enhances the interpretability and transparency of the predictions, supporting more informed financial decision-making.
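As a rough illustration of how formulaic alphas can serve as model features rather than trading signals, the following sketch computes two alphas from OHLCV bars and pairs them with the next day's closing price as a regression target. The two formulas, the function names, and the toy data are hypothetical and chosen only for illustration; they are not alphas generated by the LLM in this work.

```python
from typing import Dict, List, Tuple

EPS = 1e-9  # guards against division by zero on flat bars

Bar = Dict[str, float]


def intraday_strength(bar: Bar) -> float:
    """Hypothetical alpha: (Close - Open) / (High - Low)."""
    return (bar["Close"] - bar["Open"]) / (bar["High"] - bar["Low"] + EPS)


def volume_weighted_momentum(bars: List[Bar], t: int, window: int) -> float:
    """Hypothetical alpha: window-day return scaled by relative volume."""
    ret = bars[t]["Close"] / bars[t - window]["Close"] - 1.0
    avg_vol = sum(b["Volume"] for b in bars[t - window:t]) / window
    return ret * bars[t]["Volume"] / (avg_vol + EPS)


def build_dataset(bars: List[Bar], window: int = 2) -> Tuple[List[List[float]], List[float]]:
    """Return (features, targets): alpha values on day t, Close on day t + 1."""
    features, targets = [], []
    for t in range(window, len(bars) - 1):
        features.append([
            intraday_strength(bars[t]),
            volume_weighted_momentum(bars, t, window),
        ])
        targets.append(bars[t + 1]["Close"])
    return features, targets


# Toy OHLCV data (made up for the example, not real market data).
bars = [
    {"Open": 10.0, "High": 11.0, "Low": 9.0, "Close": 10.0, "Volume": 100.0},
    {"Open": 10.0, "High": 12.0, "Low": 10.0, "Close": 11.0, "Volume": 100.0},
    {"Open": 11.0, "High": 13.0, "Low": 11.0, "Close": 12.0, "Volume": 200.0},
    {"Open": 12.0, "High": 13.0, "Low": 11.0, "Close": 12.5, "Volume": 150.0},
]
X, y = build_dataset(bars, window=2)
```

The resulting `X` and `y` could then be passed to any regressor, such as a Transformer, LSTM, or Random Forest; the choice of a two-day window here is arbitrary.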