Large language models (LLMs) are increasingly applied to tabular tasks using in-context learning. The prompt representation of a table may affect an LLM's ability to process that table. Inspired by prior work, we generate a collection of self-supervised structural tasks (e.g., navigate to a cell and row; transpose the table) and evaluate the performance differences across 8 formats. In contrast to past work, we introduce 8 noise operations inspired by real-world messy data and adversarial inputs, and show that such operations can impact LLM performance across formats for different structural understanding tasks.
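To make the setup concrete, the following is a minimal sketch of how one table might be serialized into several prompt formats, perturbed by a noise operation, and turned into a self-supervised structural task whose answer is known from the table itself. The three formats (CSV, Markdown, JSON), the column-shuffle noise operation, and all function names here are illustrative assumptions; the abstract does not list the 8 formats or the 8 noise operations actually used.

```python
import csv
import io
import json
import random

# A table is a list of rows; the first row is the header (an assumption of this sketch).
Table = list[list[str]]


def to_csv(table: Table) -> str:
    """Serialize the table as comma-separated text."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(table)
    return buf.getvalue()


def to_markdown(table: Table) -> str:
    """Serialize the table as a pipe-delimited Markdown table."""
    header, *rows = table
    lines = [
        "| " + " | ".join(header) + " |",
        "| " + " | ".join("---" for _ in header) + " |",
    ]
    lines += ["| " + " | ".join(row) + " |" for row in rows]
    return "\n".join(lines)


def to_json(table: Table) -> str:
    """Serialize the table as a JSON list of row objects keyed by column name."""
    header, *rows = table
    return json.dumps([dict(zip(header, row)) for row in rows])


def shuffle_columns(table: Table, rng: random.Random) -> Table:
    """Hypothetical noise operation: apply one random column order to every row."""
    order = list(range(len(table[0])))
    rng.shuffle(order)
    return [[row[i] for i in order] for row in table]


def transpose(table: Table) -> Table:
    """Ground truth for a 'transpose the table' task."""
    return [list(col) for col in zip(*table)]


def cell_lookup_task(table: Table, rng: random.Random) -> tuple[str, str]:
    """Build a 'navigate to a cell' question; the answer comes from the table itself."""
    header, *rows = table
    r = rng.randrange(len(rows))
    c = rng.randrange(len(header))
    question = f"What is the value in data row {r + 1}, column '{header[c]}'?"
    return question, rows[r][c]


if __name__ == "__main__":
    rng = random.Random(0)
    table = [["name", "age"], ["Ada", "36"], ["Alan", "41"]]
    noisy = shuffle_columns(table, rng)
    question, answer = cell_lookup_task(noisy, rng)
    for serialize in (to_csv, to_markdown, to_json):
        # Each prompt pairs one serialization of the noisy table with the same question.
        print(f"{serialize(noisy)}\n{question}\n(expected: {answer})\n")
```

Because the expected answer is derived from the table rather than from human labels, the same question can be posed under every format and noise setting and scored automatically.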