Advances in deep learning (DL) and computer vision are rapidly reshaping the field of chart understanding. In particular, multimodal large language models (MLLMs) have proven efficient and accurate at understanding charts. To measure MLLM performance accurately, the research community has developed multiple benchmark datasets. Our examination of these datasets shows that all of them are limited to a small set of chart types. To bridge this gap, we propose the ChartComplete dataset. It is based on a chart taxonomy borrowed from the visualization community and covers thirty different chart types. The dataset is a collection of classified chart images and does not include a learning signal. We release ChartComplete as is for the community to build upon.