Tables are crucial containers of information, but understanding their meaning can be challenging. Over the years, interest has surged in data-driven approaches based on deep learning, which have increasingly been combined with heuristic-based ones. More recently, the advent of \acf{llms} has led to a new category of approaches for table annotation. However, these approaches have not been evaluated on common ground, making comparison difficult. This work proposes an extensive evaluation of four state-of-the-art (SOTA) Semantic Table Interpretation (STI) approaches: Alligator (formerly s-elbat), Dagobah, TURL, and TableLlama; the first two belong to the family of heuristic-based algorithms, while the others are, respectively, an encoder-only and a decoder-only LLM. We also include GPT-4o and GPT-4o-mini in the evaluation, since they excel on various public benchmarks. The primary objective is to measure the ability of these approaches to solve the entity disambiguation task, with respect to both the performance achieved in a common evaluation setting and the computational and cost requirements involved, with the ultimate aim of charting new research paths in the field.