With advances in deep learning (DL) and computer vision techniques, the field of chart understanding is evolving rapidly. In particular, multimodal large language models (MLLMs) are proving to be efficient and accurate at understanding charts. To measure the performance of MLLMs accurately, the research community has developed multiple datasets to serve as benchmarks. On examining these datasets, we found that they are all limited to a small set of chart types. To bridge this gap, we propose the ChartComplete dataset. The dataset is based on a chart taxonomy borrowed from the visualization community and covers thirty different chart types. It is a collection of classified chart images and does not include a learning signal. We present the ChartComplete dataset as is for the community to build upon.