The increasing realism of AI-Generated Images (AIGI) has created an urgent need for forensic tools that can reliably distinguish synthetic content from authentic imagery. Existing detectors are typically tailored to specific forgery artifacts, such as frequency-domain patterns or semantic inconsistencies, so each performs well only within its specialty and different detectors can return conflicting judgments. To address these limitations, we present \textbf{AgentFoX}, a Large Language Model-driven framework that recasts AIGI detection as a dynamic, multi-phase analytical process. Our approach employs a quick-integration fusion mechanism guided by a curated knowledge base comprising calibrated Expert Profiles and contextual Clustering Profiles. During inference, the agent begins with a high-level semantic assessment, then transitions to a fine-grained, context-aware synthesis of signal-level expert evidence, resolving contradictions through structured reasoning. Instead of returning a coarse binary output, AgentFoX produces a detailed, human-readable forensic report that substantiates its verdict, improving interpretability and trustworthiness for real-world deployment. Beyond providing a novel detection solution, this work introduces a scalable agentic paradigm that facilitates the intelligent integration of future and evolving forensic tools.