Multimodal large language models (MLLMs) are increasingly used to automate chart generation from data tables, enabling efficient data analysis and reporting but also introducing new misuse risks. In this work, we introduce ChartAttack, a novel framework for evaluating how MLLMs can be misused to generate misleading charts at scale. ChartAttack injects misleaders into chart designs, aiming to induce incorrect interpretations of the underlying data. Furthermore, we create AttackViz, a chart question-answering (QA) dataset where each (chart specification, QA) pair is labeled with effective misleaders and their induced incorrect answers. Experiments in in-domain and cross-domain settings show that ChartAttack significantly degrades the QA performance of MLLM readers, reducing accuracy by an average of 19.6 points and 14.9 points, respectively. A human study further shows an average 20.2-point drop in accuracy for participants exposed to misleading charts generated by ChartAttack. Our findings highlight an urgent need for robustness and security considerations in the design, evaluation, and deployment of MLLM-based chart generation systems. We make our code and data publicly available.