Tables, typically two-dimensional and structured to store large amounts of data, are essential in daily activities such as database queries, spreadsheet manipulation, question answering over web tables, and information extraction from table images. Automating these table-centric tasks with Large Language Models (LLMs) or Visual Language Models (VLMs) offers significant public benefits and has attracted interest from both academia and industry. This survey provides a comprehensive overview of table-related tasks, examining both user scenarios and technical aspects. It covers traditional tasks such as table question answering as well as emerging fields such as spreadsheet manipulation and table data analysis. We summarize the training techniques for LLMs and VLMs tailored to table processing. We also discuss prompt engineering, particularly the use of LLM-powered agents, for various table-related tasks. Finally, we highlight several challenges, including handling implicit user intentions and extracting information from diverse table sources.