With the rapid adoption of large language models (LLMs) in financial services, detecting unsafe dialogue under high regulatory risk has become a significant challenge. Existing methods rely mainly on single-dimensional semantic judgments or fixed rules, which makes them inadequate for multi-turn semantic evolution and complex regulatory clauses; moreover, no existing model is designed specifically for financial security detection. To address these issues, this paper proposes FinSec, a four-tier security detection framework for financial agents. FinSec combines suspicious behavior pattern analysis, delayed risk and adversarial inference, semantic security analysis, and integrated risk-based decision-making to provide structured, interpretable, end-to-end identification of actual financial risks. Notably, FinSec significantly improves the robustness of high-risk dialogue detection while maintaining model utility. In experiments, FinSec outperforms the baseline models. In overall detection capability, it achieves an F1 score of 90.13%, 6 to 14 percentage points above the baselines; its ASR falls to 9.09%, markedly lowering the probability of unsafe outputs; and its AUPRC rises to 0.9189, a gain of approximately 9.7% over general frameworks. In balancing utility and safety, FinSec obtains a composite score of 0.9098, delivering robust and efficient protection for financial agent dialogues.
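
The abstract reports F1 and ASR without defining them. As an illustrative sketch only, the snippet below assumes that F1 is the usual harmonic mean of precision and recall over the "unsafe" class, and that ASR denotes attack success rate, taken here as the share of unsafe dialogues that evade detection. The names `Counts`, `f1_score`, and `attack_success_rate`, and the example counts, are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """Confusion-matrix counts, treating 'unsafe' as the positive class."""
    tp: int  # unsafe dialogues correctly flagged
    fp: int  # safe dialogues wrongly flagged
    fn: int  # unsafe dialogues that were missed
    tn: int  # safe dialogues correctly passed


def f1_score(c: Counts) -> float:
    """Harmonic mean of precision and recall; 0.0 when undefined."""
    precision = c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0
    recall = c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def attack_success_rate(c: Counts) -> float:
    """Assumed definition: fraction of unsafe dialogues not detected."""
    unsafe_total = c.tp + c.fn
    return c.fn / unsafe_total if unsafe_total else 0.0


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
example = Counts(tp=90, fp=10, fn=10, tn=90)
f1 = f1_score(example)              # precision = recall = 0.9
asr = attack_success_rate(example)  # 10 of 100 unsafe dialogues missed
```

Under this assumed definition, ASR equals one minus recall on the unsafe class, so a lower ASR and a higher F1 tend to move together unless precision drops.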