Financial multimodal reasoning requires agents to coordinate numerical computation, retrieval, visual interpretation, and temporal grounding across heterogeneous evidence sources. Existing tool-augmented agents improve execution fidelity, yet they remain largely stateless across episodes and must repeatedly rediscover the same reasoning strategies and failure patterns. In high-stakes financial settings, this leads to unreliable tool routing, noisy retrieval, and hallucination-prone reasoning. We present FinAcumen, a financial reasoning agent framework centered on selective experience memory for tool-augmented multimodal reasoning. FinAcumen accumulates financially grounded reasoning experience from prior trajectories, distilling successful strategies and failure-derived cautionary rules into a persistent memory bank. During inference, retrieved experiences condition reasoning only when their semantic relevance exceeds a calibrated threshold; otherwise, a fallback mechanism explicitly suppresses the irrelevant memory. A deterministic financial tool environment further grounds numerical computation, retrieval, visual decoding, and answer verification. Across four financial multimodal reasoning benchmarks, FinAcumen consistently lifts a frozen 8B vision-language model above finance-specialized models and brings it close to leading proprietary general-purpose models. Further analysis shows that selective experience activation improves reasoning reliability under retrieval uncertainty. Our code is anonymously available at https://anonymous.4open.science/r/FinAcumen.
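The selective activation step described above can be illustrated with a minimal sketch. It is not the FinAcumen implementation: the `Experience` class, the functions `select_experiences` and `build_prompt`, the use of cosine similarity as the relevance score, and the values `threshold=0.8` and `top_k=2` are all assumptions made for illustration. The sketch only shows the gating idea, namely that memory conditions the prompt when relevance exceeds a threshold and is dropped entirely otherwise.

```python
from dataclasses import dataclass
from math import sqrt
from typing import List, Sequence


@dataclass
class Experience:
    """Hypothetical memory entry: a distilled strategy or cautionary rule."""
    text: str
    embedding: Sequence[float]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_experiences(
    query: Sequence[float],
    bank: List[Experience],
    threshold: float = 0.8,  # assumed value; the paper calibrates its own
    top_k: int = 2,          # assumed value
) -> List[Experience]:
    """Return up to top_k entries whose similarity strictly exceeds threshold."""
    scored = [(cosine(query, e.embedding), e) for e in bank]
    relevant = [(s, e) for s, e in scored if s > threshold]
    relevant.sort(key=lambda pair: pair[0], reverse=True)
    return [e for _, e in relevant[:top_k]]


def build_prompt(question: str, experiences: List[Experience]) -> str:
    """Condition on memory only if something was selected; else fall back."""
    if not experiences:
        # Fallback path: irrelevant memory is suppressed, prompt is unchanged.
        return question
    hints = "\n".join(f"- {e.text}" for e in experiences)
    return f"Relevant prior experience:\n{hints}\n\nQuestion: {question}"


bank = [
    Experience("Check the fiscal year before comparing figures.", [1.0, 0.0]),
    Experience("Read chart axis units before extracting values.", [0.0, 1.0]),
]

# A query close to the first entry activates it; an ambiguous query activates nothing.
activated = select_experiences([1.0, 0.1], bank)
suppressed = select_experiences([1.0, 1.0], bank)
```

In this toy setup, the ambiguous query scores about 0.71 against both entries, which is below the assumed threshold, so `build_prompt` returns the question untouched rather than injecting weakly related memory.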