From Pixels to Insights: A Survey on Automatic Chart Understanding in the Era of Large Foundation Models

Data visualization in the form of charts plays a pivotal role in data analysis, offering critical insights and supporting informed decision-making. Automatic chart understanding has advanced significantly in recent years with the rise of large foundation models. Foundation models, such as large language models (LLMs), have revolutionized various natural language processing (NLP) tasks and are increasingly being applied to chart understanding. This survey provides a comprehensive overview of recent developments, challenges, and future directions in chart understanding in the context of these foundation models. We begin by defining chart understanding, outlining problem formulations, and discussing the fundamental building blocks for studying chart understanding tasks. In the section on tasks and datasets, we explore the various chart understanding tasks and discuss their evaluation metrics and the sources of both charts and textual inputs. We then examine modeling strategies, encompassing both classification-based and generation-based approaches, along with tool augmentation techniques that enhance chart understanding performance. We also review the state-of-the-art performance on each task and discuss how it can be improved. A dedicated section addresses challenges and future directions, highlighting issues such as domain-specific charts, the lack of effort devoted to evaluation, and agent-oriented settings. This survey aims to provide valuable insights and directions for future research on chart understanding with large foundation models. The studies mentioned in this paper, along with emerging research, will be continually updated at: https://github.com/khuangaf/Awesome-Chart-Understanding.
