Despite continuous advancements in the capabilities of large language models (LLMs), numerical reasoning remains a challenging area. Techniques such as chain-of-thought, tree-of-thought, and program-of-thought prompting guide LLMs through intermediate reasoning steps. Although in-context learning with few-shot prompting has improved performance, LLMs still lag behind state-of-the-art models on financial numerical reasoning datasets such as FinQA and ConvFinQA. In this work, we introduce FINDER, a novel two-step framework, to enhance LLMs' capabilities in financial numerical reasoning. The first step uses a generative retriever to extract relevant facts from unstructured data, including both text and tables. This is followed by context-aware Program of Thought prompting with dynamic selection of in-context examples. FINDER achieves new state-of-the-art performance on both the FinQA and ConvFinQA datasets, surpassing the previous best results with execution accuracy improvements of 5.98% and 4.05%, respectively.
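The two-step pipeline described above can be sketched minimally as follows. This is not the paper's implementation: `retrieve_facts` and `generate_program` are hypothetical stand-ins for FINDER's generative retriever and LLM call, and the hard-coded program string plays the role of a model-generated Program of Thought. The key idea illustrated is that the final answer comes from *executing* generated code rather than from the model emitting a number directly.

```python
# Minimal sketch of a retrieve-then-Program-of-Thought pipeline.
# retrieve_facts and generate_program are hypothetical placeholders,
# NOT the FINDER implementation described in the paper.

def retrieve_facts(question, document_chunks):
    """Toy retriever: keep chunks sharing a content word with the question."""
    stopwords = {"the", "was", "what", "from", "to"}
    q_tokens = set(question.lower().split()) - stopwords
    return [c for c in document_chunks if q_tokens & set(c.lower().split())]

def generate_program(question, facts):
    """Stand-in for the LLM's Program-of-Thought output: Python code that
    stores the answer in a variable named `ans`.  A real system would prompt
    an LLM with the retrieved facts plus dynamically selected in-context
    examples; here we return a fixed example program."""
    return (
        "revenue_2020 = 120.0\n"
        "revenue_2019 = 100.0\n"
        "ans = (revenue_2020 - revenue_2019) / revenue_2019"
    )

def execute_program(program):
    """Execute the generated program and read back its `ans` variable."""
    scope = {}
    exec(program, {}, scope)
    return scope["ans"]

chunks = [
    "Table: revenue 2019 was 100.0 million, revenue 2020 was 120.0 million.",
    "The company relocated its headquarters in 2018.",
]
question = "What was the revenue growth from 2019 to 2020?"
facts = retrieve_facts(question, chunks)
answer = execute_program(generate_program(question, facts))
print(round(answer, 3))
```

In this toy run the retriever keeps only the revenue chunk, and executing the generated program yields the relative growth (120 − 100) / 100 = 0.2; the separation between fact retrieval and program execution mirrors the two steps the abstract describes.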