Charts are a fundamental visualization format for structured data analysis. Enabling end-to-end chart editing according to user intent is of great practical value, yet remains challenging due to the need for both fine-grained control and global structural consistency. Most existing approaches adopt pipeline-based designs, where natural language or code serves as an intermediate representation, limiting their ability to faithfully execute complex edits. We introduce ChartE$^{3}$, an End-to-End Chart Editing benchmark that directly evaluates models without relying on intermediate natural language programs or code-level supervision. ChartE$^{3}$ focuses on two complementary editing dimensions: local editing, which involves fine-grained appearance changes such as font or color adjustments, and global editing, which requires holistic, data-centric transformations including data filtering and trend line addition. ChartE$^{3}$ contains over 1,200 high-quality samples constructed via a well-designed data pipeline with human curation. Each sample is provided as a triplet of a chart image, its underlying code, and a multimodal editing instruction, enabling evaluation from both objective and subjective perspectives. Extensive benchmarking of state-of-the-art multimodal large language models reveals substantial performance gaps, particularly on global editing tasks, highlighting critical limitations in current end-to-end chart editing capabilities.
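As a rough illustration of the sample format described above, the triplet could be represented in Python as follows. This is only a sketch under stated assumptions: the class and field names (`ChartEditSample`, `EditScope`, `instruction_image_path`, and so on) are hypothetical and are not taken from the benchmark's release. Modelling the multimodal instruction as text plus an optional reference image is likewise an assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Iterable, List, Optional


class EditScope(Enum):
    """The two editing dimensions named in the abstract."""
    LOCAL = "local"    # fine-grained appearance changes, e.g. font or color
    GLOBAL = "global"  # data-centric transformations, e.g. filtering, trend lines


@dataclass(frozen=True)
class ChartEditSample:
    """Hypothetical container for one benchmark sample (all names are assumptions)."""
    image_path: str        # rendered chart image
    source_code: str       # underlying plotting code for the chart
    instruction_text: str  # textual part of the editing instruction
    scope: EditScope       # local or global edit
    # Assumed: the multimodal instruction may carry an optional visual reference.
    instruction_image_path: Optional[str] = None


def split_by_scope(samples: Iterable[ChartEditSample]) -> Dict[EditScope, List[ChartEditSample]]:
    """Group samples by editing dimension so each can be scored separately."""
    grouped: Dict[EditScope, List[ChartEditSample]] = {scope: [] for scope in EditScope}
    for sample in samples:
        grouped[sample.scope].append(sample)
    return grouped


if __name__ == "__main__":
    demo = [
        ChartEditSample("bar.png", "plt.bar(x, y)", "Make the bars red.", EditScope.LOCAL),
        ChartEditSample("line.png", "plt.plot(x, y)", "Add a linear trend line.", EditScope.GLOBAL),
    ]
    for scope, items in split_by_scope(demo).items():
        print(scope.value, len(items))
```

Grouping by scope mirrors the abstract's observation that performance gaps are largest on global edits, so reporting the two dimensions separately is a natural choice.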