Financial narratives in U.S. Securities and Exchange Commission (SEC) filings and quarterly earnings call transcripts (ECTs) are important to investors, auditors, and regulators. However, their length, financial jargon, and nuanced language make fine-grained analysis difficult. Prior work on financial sentiment analysis has relied on large, expensive labeled datasets, which makes it challenging to detect sentence-level stance toward specific financial targets. In this work, we introduce a sentence-level corpus for stance detection focused on three core financial metrics: debt, earnings per share (EPS), and sales. The sentences were extracted from Form 10-K annual reports and ECTs and labeled for stance (positive, negative, or neutral) using the ChatGPT-o3-pro model, with rigorous human validation of the labels. Using this corpus, we systematically evaluate modern large language models (LLMs) under zero-shot, few-shot, and Chain-of-Thought (CoT) prompting strategies. Our results show that few-shot prompting combined with CoT performs best, outperforming the supervised baselines, and that LLM performance varies across the SEC and ECT datasets. These findings highlight the practical viability of using LLMs for target-specific stance detection in the financial domain without extensive labeled data.