Vertical bar, horizontal bar, dot, scatter, and line plots provide a diverse set of visualizations for representing data. To understand these plots, one must be able to recognize textual components, locate data points in a plot, and process diverse visual contexts to extract information. In recent works such as Pix2Struct, MatCha, and DePlot, OCR-free chart-to-text translation has achieved state-of-the-art results on visual language tasks. These results highlight the importance of chart derendering as a pre-training objective, yet existing datasets provide only a fixed set of training examples. In this paper, we propose GenPlot, a plot generator that uses synthetic data to generate billions of additional plots for chart derendering.