Despite the success of Vision-Language Models (VLMs), misleading charts remain a significant challenge because of their deceptive visual structures and distorted data representations. We present ChartCynics, an agentic dual-path framework that unmasks visual deception through a "skeptical" reasoning paradigm. Unlike holistic models, ChartCynics decouples perception from verification: a Diagnostic Vision Path captures structural anomalies (e.g., inverted axes) through strategic region-of-interest (ROI) cropping, while an OCR-Driven Data Path provides numerical grounding. To resolve cross-modal conflicts, we introduce an Agentic Summarizer optimized with a two-stage protocol: Oracle-Informed supervised fine-tuning (SFT) for reasoning distillation and Deception-Aware Group Relative Policy Optimization (GRPO) for adversarial alignment. This pipeline penalizes visual traps and enforces logical consistency. On two benchmarks, ChartCynics achieves 74.43% and 64.55% accuracy, an absolute gain of roughly 29 percentage points over its Qwen3-VL-8B backbone, and outperforms state-of-the-art proprietary models. These results demonstrate that specialized agentic workflows can give smaller open-source models superior robustness, establishing a new foundation for trustworthy chart interpretation.
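To make the Deception-Aware GRPO stage more concrete, the following is a minimal sketch of how a group-relative advantage could be computed from a reward that penalizes visual traps. It is not the authors' implementation. The reward function `deception_aware_reward`, its `trap_penalty` parameter, and the multiple-choice answer format are hypothetical and introduced only for illustration; the abstract does not specify the actual reward design. The only part taken from standard GRPO is the normalization of each reward against the mean and standard deviation of its own sampling group.

```python
from statistics import mean, pstdev


def deception_aware_reward(answer, correct, trap, trap_penalty=1.0):
    """Hypothetical reward (an assumption, not from the paper):
    +1 for the correct answer, -trap_penalty for the answer that the
    misleading visual suggests, and 0 for any other wrong answer."""
    if answer == correct:
        return 1.0
    if answer == trap:
        return -trap_penalty
    return 0.0


def group_relative_advantages(rewards, eps=1e-8):
    """GRPO-style advantages: standardize each reward within its group.
    Using the population standard deviation is a choice made for this sketch."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled answers to one chart question (illustrative values only).
samples = ["B", "A", "C", "B"]
rewards = [deception_aware_reward(s, correct="B", trap="A") for s in samples]
advantages = group_relative_advantages(rewards)

for s, r, a in zip(samples, rewards, advantages):
    print(f"answer={s} reward={r:+.1f} advantage={a:+.3f}")
```

Under these assumptions, the sample that picks the trap answer receives the lowest advantage in its group, lower than an ordinary wrong answer, so a policy update would push hardest against the response that fell for the visual deception.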