In financial analytics, using unstructured data such as earnings conference calls (ECCs) to forecast stock performance is a critical challenge that has attracted both academics and investors. While previous studies have used deep learning-based models to obtain a general view of ECCs, they often fail to capture detailed, complex information. Our study introduces a novel framework, \textbf{ECC Analyzer}, which combines Large Language Models (LLMs) and multi-modal techniques to extract richer, more predictive insights. The model begins by summarizing the transcript's structure and analyzing the speakers' mode of expression and confidence level by detecting variations in tone and pitch in the audio. This analysis helps investors form an overall impression of the ECC. The model then uses Retrieval-Augmented Generation (RAG)-based methods to extract, from an expert's perspective, the focuses that have a significant impact on stock performance, providing a more targeted analysis. It further enriches these extracted focuses with additional layers of analysis, such as sentiment and audio segment features. By integrating these insights, the ECC Analyzer performs multi-task predictions of stock performance, including volatility, value-at-risk (VaR), and return over different intervals. The results show that our model outperforms traditional analytic benchmarks, confirming the effectiveness of using advanced LLM techniques in financial analytics.
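To make the three prediction targets concrete, the following is a minimal sketch of how return, volatility, and VaR labels over an n-day interval after a call might be computed from daily closing prices. The abstract does not give these definitions, so everything here is an assumption for illustration: the function names (`log_returns`, `realized_volatility`, `historical_var`, `interval_targets`), the use of log returns, the population standard deviation as the volatility measure, and the empirical-quantile (historical) VaR at level `alpha`.

```python
import math
from statistics import mean


def log_returns(prices):
    """Daily log returns from a list of closing prices."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]


def realized_volatility(returns):
    """Population standard deviation of returns (assumed volatility measure)."""
    mu = mean(returns)
    return math.sqrt(sum((r - mu) ** 2 for r in returns) / len(returns))


def historical_var(returns, alpha=0.05):
    """Empirical alpha-quantile of returns, reported as a positive loss."""
    ordered = sorted(returns)
    k = max(0, math.ceil(alpha * len(ordered)) - 1)
    return -ordered[k]


def interval_targets(prices, call_index, horizon, alpha=0.05):
    """Hypothetical label builder for one call and one horizon.

    prices     -- daily closing prices, oldest first
    call_index -- index of the trading day of the earnings call
    horizon    -- number of trading days after the call (the interval)
    """
    window = prices[call_index:call_index + horizon + 1]
    if len(window) < horizon + 1:
        raise ValueError("not enough prices after the call for this horizon")
    rets = log_returns(window)
    return {
        "return": window[-1] / window[0] - 1.0,   # simple return over the interval
        "volatility": realized_volatility(rets),  # dispersion of daily log returns
        "var": historical_var(rets, alpha),       # loss at the alpha quantile
    }


if __name__ == "__main__":
    closes = [100.0, 102.0, 101.0, 105.0, 103.0, 104.0]
    print(interval_targets(closes, call_index=0, horizon=5))
```

Calling `interval_targets` with several values of `horizon` yields one set of labels per interval, which is the shape a multi-task predictor would be trained against. With very short windows the historical VaR reduces to the worst daily return, so a real setup would likely use longer windows or a parametric estimate.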